ABSTRACT


An enumeration algorithm which synthesizes programs from
example computations is presented. The algorithm, originally
proposed by Alan W. Biermann of Duke University, assigns a
labelling of the instructions contained in an example trace
consistent with producing minimum state Moore machine
representations for the synthesized programs. Techniques for
processing the information to reduce enumeration are given.
Biermann's algorithm is extended by trace preprocessing
techniques which identify and generalize conditions on
instruction sequencing in the synthesized programs without
the user's assistance. The techniques are presented using
text editing as the domain, but are general enough to be
extendable into other domains.

I. INTRODUCTION


A.  BACKGROUND 

Since the introduction of electronic computing machines,
manual tasks that are mundane, tedious and/or repetitious
have been considered for automation. The computer is ideally
suited for this type of work since it neither complains of
boredom nor wanders from its assigned task. The machine
meticulously sequences through a series of computations over
and over, producing answers consistent within the
limitations of the hardware. As consistent as the computer
is at performing tasks, assigning the tasks is still left to
the user of the system.

Programming the early machines was a difficult chore.
Communications between man and machine were only
accomplishable through the language of the machine. This
machine language consisted of binary coded machine
operations. The efficient machine language programmer had to
memorize these codes or keep a list of the codes close by.
All control transfer points had to be coded in absolute
machine addresses which the programmer calculated by hand. A
programmer had to interpret the binary representation of
the machine operations to determine the cause of errors in
programs. There were no diagnostic messages to aid the user
in isolating errors. The difficulty of programming in
machine language led to a search to find better ways of
generating programs. The first step was the recognition that
the computer was a good bookkeeper, capable of computing
absolute addresses from labels and translating mnemonic
representations of machine operation codes. Webster's New
World Dictionary, Second Edition, defines mnemonic to be, "a
system or technique of improving memory by the use of
certain formulas." Soon programs were written which would
accept abstract programs containing mnemonics and labels,
convert the mnemonics into machine operation codes and
translate the labels into absolute machine addresses. These
programs produced executable machine language code as
output. These translation programs were called assemblers
and the data they translated were called assembly language
programs.

Assembly language provided some automation of the manual
tasks associated with machine language programming. An
important convenience of assembly language is the
readability of the programs when compared to machine
language programs. The mnemonics convey the meaning of their
function while the labels relieve the programmer of
calculating absolute addresses for control transfer points.
Assembly language provided a level of abstraction which
allowed programmers to concentrate on the programming
problem without dealing with every atomic machine operation.
The assembler provided bookkeeping, address translation and
mnemonic decoding quickly and efficiently. Programmers were
now capable of producing more code in less time with fewer
errors with assembly language.

Assembly language eased the programmer's task but it
still could not be considered a panacea for computer-human
interaction. Assembly language still required the programmer
to maintain control over many machine operations and he had
to provide the logic to control the flow of program
execution. The instructions used to perform control
functions appear as similar code fragments in most programs
written in assembly language. These code fragments performed
functions such as controlling branching decisions and keeping
count of loop indices. When it was observed that common code
fragments appeared across a wide range of assembly programs,
it was recognized that these code fragments could be
represented as a single instruction and the computer could
translate the single instruction into the code fragment it
represented. The programs that translate these complex
instructions are called compilers or interpreters. The
compiled or interpreted languages that followed assembly
language in this evolutionary process incorporated the
program fragments as a single instruction for the language.
Constructs such as FOR, DO WHILE and IF THEN are examples of
higher level control structure implementation.

FORTRAN was the first in a long line of higher level
languages. FORTRAN differed from the others by becoming
endeared to a family of users and the language endures today
as one of the most frequently used higher level languages.
What qualities of the language produced this popularity?

The FORTRAN language is attributed to John Backus. His
primary goal when designing the language was to make the
language resemble the notation used in high school algebra.
Since the notation used in high school algebra was familiar
to a wide audience, FORTRAN gave a friendly appearance. The
language's apparent simplicity is the endearing quality of
FORTRAN. Some other language implementors failed to
recognize this point and their languages never received wide
acceptance. ALGOL is an example of a powerful language that
never received the acceptance anticipated.

Other programming languages that followed added compact
representation of other recurring program fragments. The
higher level constructs were not limited to control
structures but also included constructs for data
manipulation functions. Iverson's [1] APL (A Programming
Language) provided powerful operators capable of performing
complex functions such as matrix multiplication in one
instruction.

This trend continues today. Many of the newer languages
implement sophisticated and powerful operators and control
structures. Some of these languages are for a select segment
of computer users, intended for application to a particular
domain. The users are expected to be familiar with the
domain, so the form of the language should be familiar to
the user also. A problem with a domain specific language is
its inability to adapt to other areas. To work in another
area the user must become familiar with another language. A
phenomenon demonstrated by many computer users is a
reluctance to adapt themselves and learn a new language that
may be more appropriate for a given task. Either they crack
the egg with a sledge hammer or dig the well with a spoon.
When required to use a new language, the user will likely
use only a small subset of the language that is capable of
doing the job. Worse than using only a subset of the
language features is the tendency to bring old programming
styles applicable to the old language into the new language.
The point that is to be made is that learning a new
programming language is a hard chore and is avoided whenever
possible.

Another direction which the automation of programming
tasks has taken is the development of a programming
environment. A programming environment automates some of the
manual chores by providing the user with aids that assist
him in constructing programs. The environment includes a
programming language, an interactive syntax-directed editor
and an on-line debugger. The editor provides syntax error
diagnostics while the programmer is creating the source
file. The programmer is forced to correct the syntax error
immediately before the editor will allow him to continue
programming. The error should be readily apparent to the
programmer because it is in the latest input. The on-line
debugger allows the programmer to actively test his program,
halt execution, check the value of variables, change the
value of variables or change the code itself. Program
environment systems may even allow the programmer to switch
from the editor to the on-line debugger and back at any
time. A programming environment can be summarized as a
friendly interface utilizing an intelligent editor which can
recognize syntax errors in the associated programming
language and one that contains other interactive programming
tools.

Programming has been called an art form requiring
intellectual creativity. The automation of intellectual
behavior is a field of study within Computer Science called
Artificial Intelligence. The study of the automation of
programming tasks which require human-like reasoning is
called Program Synthesis or Automatic Programming. It is not
our intention to provide a definition of intelligent
behavior for a machine since there is considerable
disagreement even among the experts. However, we note that
the goal of research in automatic programming is the same
goal that led to all the advances in programming languages.
Informally, this goal is to make the interaction between man
and computer as painless as possible. That is, painless for
the man but not necessarily for the computer. Dijkstra [2]
objects to our automation of programming by claiming, "We
should not automate programming even if we can, because it
would take away our enjoyment of the task." We note there
are those who may require the use of computer services that
have neither the time nor inclination to obtain the required
education to do that chore. These include professionals such
as lawyers, physicians, and even theoretical physicists. We
assume, if programming becomes fully automated, the
programmers will then turn their attention toward other
creative and stimulating pursuits. R. Hamming has said, "The
purpose of computing is insight not numbers."

Many on-going efforts are aimed at providing better
systems for the user so he may create programs faster, with
fewer errors and with less effort. The history of programming
language development has shown that automation of many
programming tasks is feasible. How much more of the
programming tasks can be automated? What would be considered
the ultimate system for producing computer programs?

B. AUTOMATIC PROGRAMMING

1. General

Program synthesis or automatic programming is a
research topic concerned with the development of systems
that provide more and more automation of the programming
process, particularly those tasks requiring human-like
reasoning. The goal is not to create systems that program
themselves, but to create systems which can construct, under
the direction of a user, programs that can perform some
function he desires. These systems must be easy to use, easy
to learn, and increase the efficiency of the user. The users
of these systems will no longer be restricted to the few
computer professionals, but will include other professional
fields as well as non-professionals. Automatic programming
systems are to interact with the user, recognize
requirements, and then synthesize a correct program that
satisfies the requirements.

Two questions arise in the research on automatic
programming. First, what is the form of the interaction
between the user and the system? This question is called the
specification problem because it is concerned with issues
relating to how the user is to inform the system of his
requirements. The second question is, given a specification
method, what synthesis technique is available to be applied
that will transform the specification into an appropriate
program? The technique used for synthesis is often dependent
upon the form of the problem specification and most of the
projects involving automatic programming consider both
problems together. It has been proposed by Green [3] that
the two questions should be separated with research
proceeding concurrently on both problems. He proposes there
is a standard intermediate representation of the problem
specification which would permit interaction between the two
problems.

Four techniques have been proposed for the
specification problem which dominate the literature on
automatic programming. Each of the proposed techniques of
problem specification introduces a different approach to the
synthesis problem. The four specification techniques can be
categorized as follows:

1. Natural Language.

2. Formal Problem Specification.

3. Input-output Pairs.

4. Example Computations.

Each of these specification techniques will be discussed in
the following subsections, together with its relationship to
a synthesis approach.

2. Problem Specification with Natural Language

A visionary approach to the specification problem is
the use of natural language. Natural language provides a
fast, comfortable method of communication which is already
understood by humans. Implementation of a natural language
understanding system has proven to be a very difficult
problem (Class [4]).

Two forms of natural language are the spoken form
and the written form. Understanding spoken language
increases the degree of difficulty because the communication
is in the form of audio waves. Once the audio input is
captured, it must be converted into another form for further
syntactic and semantic analysis. The reader will note that
once the audio input has been captured and converted the
problem of written and spoken language becomes the same.
That is, the internal representation of the spoken and
written word can be the same and the problem becomes one of
inferring meaning from the representation. Future advances
in voice understanding hardware can be expected and these
advances may be expected to find their way into use.

A complete natural language understanding system
would be expected to be able to understand all grammatically
correct sentences. However, natural languages do not have
finite grammars. This complexity implies a complete
understanding system cannot be implemented. However, a
system capable of understanding a subset of natural language
can prove useful in specific domains. Early examples of
programming through natural language dialogue are presented
in a survey by Heidorn [5]. Current work on understanding
natural language may be found in Biermann [6], and Walker
[7].

In conclusion, natural language understanding is a
difficult problem that can be solved only in limited
domains. The use of natural language in programming has been
shown to be possible by Heidorn [5], and by Biermann [6] in
limited domains. The systems developed up to today have been
experimental systems and the results will aid in
understanding the problem. Natural language programming
systems will not be available for industry for at least a
decade. Finally, we present the example Biermann [6]
describes as a natural language specification for a problem.
This example is quoted from his paper on natural language
programming. Its intent is to give a feel for programming in
natural language. This example does not specify the
algorithm that is to be used although a natural language
programming system would be capable of accepting such a
specification.

3. Formal Problem Specification

The second technique is formal specification of the
problem. As the name implies, the input is in a more rigid
structure than natural language. This technique allows the
user to convey the behavior he desires the synthesized
program to have without specifying the algorithm that is to
be used. Smith [8] gives the following definition for the
form of a formal specification of a problem A.

"A(x) = z such that z ∈ S & P(z,x) where x ∈ D &
I(x) where D and S are the input and output data
types respectively, and I and P are the input and
output conditions respectively."

An example of a formal problem specification for a program
to compute the integer square root of a nonnegative integer
n may be found in Manna and Waldinger [9].

"sqrt(n) <== FIND z SUCH THAT
             integer(z) & z**2 =< n < (z + 1)**2
             WHERE integer(n) & 0 =< n"

In the above example n is an element of the input data type,
z is an element of the output data type, sqrt is the problem
name, integer(n) & 0 =< n is the input condition, and
integer(z) & z**2 =< n < (z + 1)**2 is the output condition.
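
The roles of the input and output conditions can be made concrete
with a short Python sketch. The function names below
(input_condition, output_condition, sqrt_by_search) are
illustrative assumptions and do not come from Manna and Waldinger.
The linear search is only one naive program that happens to satisfy
the specification; the specification itself says nothing about how
z is to be found.

```python
def input_condition(n):
    """I(n): n is a nonnegative integer."""
    return isinstance(n, int) and 0 <= n


def output_condition(z, n):
    """P(z, n): z is an integer with z**2 =< n < (z + 1)**2."""
    return isinstance(z, int) and z ** 2 <= n < (z + 1) ** 2


def sqrt_by_search(n):
    """One deliberately naive program that meets the specification."""
    if not input_condition(n):
        raise ValueError("input condition violated")
    z = 0
    # Try candidates in order until the output condition holds.
    while not output_condition(z, n):
        z += 1
    return z
```

Any other program whose result passes output_condition for every
input passing input_condition would satisfy the same specification
equally well.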

Formal problem specification and its application to
the program synthesis problem can best be explained through
examination of the work by Manna and Waldinger [9], Manna
and Waldinger [10], and Smith [8]. Although all of the work
is similar in that the formal specification is changed into
an appropriate program by some form of rewrite, it is
valuable to differentiate the approaches by their rewriting
methods.

The first example is the system of Manna and
Waldinger [9]. Their system, called a deductive approach,
converts the formal specification into a program in some
target language. Their approach "combines techniques of
unification, mathematical induction, and transformation
rules into a single system." The following is a brief
explanation of this conversion.

A structure is needed to contain initial and
intermediate results of the conversion process. This
structure is called a sequent. The sequent is a tableau
containing two lists. The first list is a list of assertions
and the second list is a list of goals. Each element in
either list may have an output expression associated with
it. Figure 1 represents a sequent as a table. Each row in
the table may contain either an assertion or a goal but not
both. Figure 1 is the initial sequent for the integer square
root problem given above. The input condition has been
placed in the assertion list and the output condition placed
in the goal list. The output variable is associated with the
output condition in the output expression column. This
initiation action assumes the input condition is true and a
search is attempted for the truth of the goal or output
condition.


sqrt(n) <== FIND z SUCH THAT
            integer(z) and z**2 =< n
            and n < (z+1)**2
            WHERE integer(n) and 0 =< n

Assertions             | Goals                    | Output
-----------------------+--------------------------+-------
integer(n) and 0 =< n  |                          |
                       | integer(z) and z**2 =< n | z
                       | and n < (z+1)**2         |

Figure 1. Initialized Sequent for the Square Root Problem

During this search, if the sequent ever contains a row where
the assertion can be trivially shown to be false or the goal
shown to be true, and if the output expression for that row
contains only primitives from the target language, then the
output expression is taken as the desired synthesized
program.
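
As a rough illustration of the tableau just described, the sketch
below models a sequent as a list of rows in Python. The class and
function names (Row, initial_sequent, is_answer) are hypothetical,
formulas are kept as plain strings, and the termination test only
recognizes rows whose formula has already been reduced to the
literal text "true" or "false". It is a data-structure sketch
under those assumptions, not the deductive system of Manna and
Waldinger.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Row:
    kind: str                     # "assertion" or "goal", never both
    formula: str                  # the assertion or goal itself
    output: Optional[str] = None  # associated output expression, if any


def initial_sequent(input_cond: str, output_cond: str,
                    out_var: str) -> List[Row]:
    """Input condition becomes an assertion, output condition a goal."""
    return [Row("assertion", input_cond),
            Row("goal", output_cond, out_var)]


def is_answer(row: Row, primitives: Set[str]) -> bool:
    """True when the row ends the search: a false assertion or a true
    goal whose output expression uses only target-language primitives.
    The output expression is assumed to be whitespace-separated tokens."""
    settled = ((row.kind == "assertion" and row.formula == "false") or
               (row.kind == "goal" and row.formula == "true"))
    if not settled or row.output is None:
        return False
    return all(token in primitives for token in row.output.split())


sequent = initial_sequent("integer(n) and 0 =< n",
                          "integer(z) and z**2 =< n and n < (z+1)**2",
                          "z")
```

The initial sequent built here corresponds to Figure 1; neither of
its rows passes is_answer, so deduction would have to continue.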

Once the tableau is initialized, the system's
deductive rules are applied to the assertions and goals. The
application of these rules will cause the creation of new
assertions and goals and associated output expressions. The
rules may then be applied to the new goals and assertions
until the condition for a program is satisfied. The
application of the rules changes the entries in the tableau
without changing the meaning of the tableau. We recommend
that the interested reader review the original work for a
description of the rules and their application.

The attraction of this theorem-proving technique is
that the resulting program can be proven correct by the same
steps used to create it. Currently there is not a running
implementation of this technique. One of the implementation
questions is determining what rule to apply at each step in
the synthesis process. This problem can be viewed as a
search through all possible sequences of rule applications.
This search space may become astronomical for any relatively
complex program since it may require hundreds of rule
applications. What is needed is a mechanism that can control
the search in a reasonable fashion. The form of control may
be heuristic in that there is a feel for where a rule should
be applied. If this intuitive feel can be quantized, then
this technique may become practical.

Earlier work by Manna and Waldinger [10] on the
DEDALUS automatic programming system also required formal
problem specifications. The DEDALUS system, an implemented
automatic programming system, utilized only transformation
rules. A transformation rule simply rewrites a portion of the
specification into another equivalent form. The continuous
application of these rules would eventually result in a
program in the target language.

4.  Input-Output  Pair  Specification 

Input-output pairs are a method of describing a
problem with examples of input and output behavior. For
example, if someone wanted to describe a program to compute
the Fibonacci numbers then he could supply the input-output
pairs

(1, 1)
(2, 3)
(3, 5)
(5, 8)
(8, 13)
The goal of a synthesizer system is to determine the
desired program from the examples of the input-output
behavior. One approach is to enumerate all possible programs
in the target language in order and test each program for
the desired behavior. That is, test each enumerated program
by giving it the input from each of the examples and see if
the program will give the associated output. The enumeration
will produce the correct program at some point but you
cannot determine if an arbitrary program can produce the
desired behavior (see Biermann [11]). Therefore, the
following theorem is given by Biermann, "The programs for
the partial recursive functions cannot be generated from
sample of input-output behavior." A large class of programs
may be inferred from examples of input-output pairs provided
they belong to the class of programs where the halting
problem is decidable. Smith [12] and Summers [13] have
looked at the synthesis of LISP programs from example
input-output pairs. It has been shown that a restricted
class of LISP programs can be synthesized from example pairs
without enumeration over the class. The reader is invited to
review Biermann [14] and Gold [15] for theoretical
background information.
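
The enumerate-and-test approach can be sketched in Python for a
deliberately tiny target language. Everything below is an
assumption made for illustration: the language consists only of
the variable x, the constants 1 and 2, and the operators + and *,
and the names expressions and synthesize are hypothetical. Because
every expression in this language terminates, each candidate can
be tested safely; this is the kind of restriction the text refers
to when it requires a class of programs with a decidable halting
problem.

```python
def expressions(size):
    """Yield (text, function) for every expression with `size` leaves."""
    if size == 1:
        yield "x", lambda x: x
        yield "1", lambda x: 1
        yield "2", lambda x: 2
        return
    for left_size in range(1, size):
        for lt, lf in expressions(left_size):
            for rt, rf in expressions(size - left_size):
                # Default arguments freeze lf/rf for each lambda.
                yield (f"({lt} + {rt})",
                       lambda x, lf=lf, rf=rf: lf(x) + rf(x))
                yield (f"({lt} * {rt})",
                       lambda x, lf=lf, rf=rf: lf(x) * rf(x))


def synthesize(pairs, max_size=5):
    """Return the first enumerated expression consistent with all pairs."""
    for size in range(1, max_size + 1):      # smallest programs first
        for text, fn in expressions(size):
            if all(fn(i) == o for i, o in pairs):
                return text
    return None                              # nothing within the bound


program = synthesize([(0, 1), (1, 3), (2, 5)])
```

The result is only guaranteed to agree with the supplied pairs;
further pairs may be needed to rule out other expressions that
happen to fit the same examples.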

5.  Example  Computations 

Program specification using example computations
allows more information to be obtained from the user. An
example computation is a sequence of instructions, without
an explicit control structure, which the user provides the
system in order to describe the behavior he wants from a
program. Examples are a good communication method which
people use to describe new concepts or explain new
processes. To describe a problem to the computer the user
uses the available instructions and provides an example of
what he wants done. Figure 2 shows an example computation
that demonstrates how to compute the first 10 Fibonacci
numbers.

In Figure 2 the two operand instructions (e.g. ADD)
perform the action on the two operands and leave the result
in the first operand. For example, if A = 2 and B = 3 then
ADD A,B would result in A = 5 and B = 3. All of the
instructions perform action on some variables except for the
START, HALT, and NOTE instructions. START and HALT flag the
beginning and end of the program respectively. The NOTE
instruction provides information on the reason for the
execution of the next instruction.
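
The instruction semantics just described can be mimicked with a
very small Python interpreter. This is a sketch under stated
assumptions: the function name run_trace and the textual
instruction format are hypothetical, and only ADD, START, HALT and
NOTE are handled because those are the only instructions named in
the text.

```python
def run_trace(trace, env):
    """Execute a list of instruction strings against variable bindings."""
    for line in trace:
        op, _, rest = line.partition(" ")
        if op in ("START", "NOTE"):
            continue                  # these touch no variables
        if op == "HALT":
            break                     # end of the program
        if op == "ADD":
            dst, src = (name.strip() for name in rest.split(","))
            env[dst] = env[dst] + env[src]   # result stays in operand 1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return env


result = run_trace(["START", "NOTE add B into A", "ADD A,B", "HALT"],
                   {"A": 2, "B": 3})
```

With A = 2 and B = 3 the trace leaves A = 5 and B = 3, matching
the worked example in the text.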

This method of specification depends on the user to
supply more information about the problem, including the
algorithm to be synthesized. The algorithm is implicitly
defined by the example computation that is given. This
specification technique should be contrasted with the
previous techniques. Note that the formal specification and
the input-output pair specification only required the user
to specify the desired behavior without specifying the
algorithm. Thus it can be claimed that these two methods
intentionally ignore information that the user has, assuming
that most users have an idea of the form of the algorithm.


The primary contributor to the understanding of
program synthesis has been Alan W. Biermann (see Biermann
and Krishnaswamy [16] and Biermann, Baum and Petry). In
particular, Biermann [16] provides a formal definition of an
algorithm that will synthesize programs from example
computations. The algorithm and variations have provided the
basic structure upon which this thesis has been developed.
Briefly, the algorithm identifies the conditions that may
have inadvertently (or purposely) been left out of the
computation. A condition is a predicate as defined in
predicate calculus. That is, an entity for which a truth
value may be measured. Once the omitted conditions have been
inserted, the algorithm finds a labelling for the
instructions such that a program with a minimum number of
instructions is produced. To explain this labelling, assume
the instruction ADD A,B appears in three different locations
in an example computation (see Figure 2). Suppose it was
known that there have to be two occurrences of the
instruction in the program. Then two of the instructions
could be labeled with a 1 and the other instruction labeled
with a 2 to indicate that the instruction labeled 2 is
different from the instructions labeled 1. Finding the
labels for the instructions in the example computations
requires an enumeration search of all possible labellings.
The labelling selected is the first labelling that produces
a program that is deterministic.
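
The labelling search can be sketched in Python as a backtracking
enumeration. This is a simplified illustration written for this
passage, not Biermann's published algorithm: the names
find_labelling and _search are hypothetical, states are numbered
globally rather than per instruction, conditions are assumed to be
already present in the trace, and none of the pruning techniques
discussed in this thesis are included. A labelling is accepted
when every state carries one instruction and every (state,
condition) pair has a single successor, which is the determinism
requirement stated above.

```python
def find_labelling(trace):
    """trace: list of (instruction, condition) pairs, where condition is
    the test outcome observed on leaving that instruction (or None).
    Returns (labels, transitions) using as few states as possible."""
    for limit in range(1, len(trace) + 1):    # 1 state, then 2, ...
        found = _search(trace, [], {}, {}, limit)
        if found is not None:
            return found
    return None


def _search(trace, labels, instr_of, next_of, limit):
    i = len(labels)
    if i == len(trace):
        return list(labels), dict(next_of)
    instr = trace[i][0]
    newest = max(labels, default=0)
    # Reuse an old state or open exactly one new one (avoids renamings).
    for state in range(1, min(newest + 1, limit) + 1):
        if instr_of.get(state, instr) != instr:
            continue            # a state carries exactly one instruction
        edge = (labels[i - 1], trace[i - 1][1]) if i > 0 else None
        if edge is not None and next_of.get(edge, state) != state:
            continue            # same state and condition, two successors
        new_state = state not in instr_of
        new_edge = edge is not None and edge not in next_of
        instr_of[state] = instr
        if new_edge:
            next_of[edge] = state
        labels.append(state)
        found = _search(trace, labels, instr_of, next_of, limit)
        labels.pop()            # undo the choice before trying the next
        if new_edge:
            del next_of[edge]
        if new_state:
            del instr_of[state]
        if found is not None:
            return found
    return None


looped = [("START", None), ("ADD A,B", "more"), ("ADD A,B", "more"),
          ("ADD A,B", "done"), ("HALT", None)]
labels, transitions = find_labelling(looped)
```

For this trace the three ADD A,B instructions collapse into one
state with a self-loop on "more". If the conditions are removed,
the same search needs five states, which suggests why omitted
conditions have to be identified before labelling.
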

This algorithm is complete and the synthesized
programs are sound. Completeness means that the algorithm
can synthesize every possible program. Soundness means that
the synthesized program will correctly execute the example
used to construct it. A disadvantage of this synthesis
method is that the algorithm is an enumeration search and in
the worst case will require exponential time in the length
of the example computation to find a solution. Techniques
have been developed to speed up this search that will
produce satisfactory response for most practical programs.

6. A General Automatic Programmer Design

Before leaving this section on automatic programming we
wish to discuss a design for an automatic programmer that
uses at least two of the specification techniques. The name
of the system is PSI and it was designed by a group of
researchers at Stanford's Artificial Intelligence
Laboratory. The research effort was headed by Cordell Green
[3]. Green has presented a high level design of an
autoprogrammer that identifies some of the more important
areas that need further research. Green admits that the
design was an effort to focus attention on some of the
sub-areas of the overall synthesis problem. His modular
design does focus attention on different aspects of the
problem. The design decision to split the overall problem
into two main sub-problems of acquisition and synthesis is of
particular interest. This design choice allows work to
proceed concurrently on two hard problems with the interface
between the problems being some intermediate representation
of the problem.

PSI is a knowledge-based program understanding
system organized as a collection of interacting modules.
Figure 3 details the high level modular design of the PSI
system. The PSI design divides the system into two groups.
The acquisition group interfaces with the user and collects
the specification given by the user while the synthesis
group produces a program in some target language that meets
the user's requirements. Communication between the two
major groups is through an intermediate representation
called the program model. The goal of the acquisition group
is to accept the user's specification by either natural
language dialogue or by traces, and present a unified entity
to the synthesizer group. The implementation of the
synthesizer group is then simplified because of the
consistent representation it receives. Since the user's
input is converted into an intermediate representation that
is supplied to the synthesizer group, the user is free to
switch from one specification technique to another during
program specification.

The overall interaction with the user is meant to be
through natural language dialogue. Since natural language
understanding is not currently within the state of the art,
the system must interact in a subset of natural language
limited to a particular domain.

Figure 3. PSI's Modular Design [3, p.6]

The system-user interaction is to appear as natural as
possible. The system has been designed to include a
mixed-initiative dialogue capability, which means the user
or the computer can assume the dominant communication role
at different times during the discourse. This allows the
user to provide as much knowledge as he can to help the
synthesis process and allows the computer to assist the user
by asking questions or providing responses. The system
develops a current model of the user and a model of the
context that assists the system in determining when to
assume the initiative and what questions to ask the user.

A partial implementation was completed in 197? that included
the synthesis expert and the efficiency expert from the
synthesis group. The acquisition group modules have proven
to be a more difficult assignment and only portions of the
acquisition group have been implemented. The important point
of the PSI design is that it provides a modular division of
the program synthesis problem that helps provoke study into
these sub-problems.

C.  OBJECTIVES 

Automatic programmers, which synthesize programs from
example computations, require conditions to be explicitly
defined by the user in order to generate programs with a
minimum number of instructions. Previous work (Biermann and

The explicit definition of conditions is not a natural part
of an example computation. That is, one would not normally
give control structure information when using examples to
explain how a task is to be performed. Our objective is to
provide an environment where the user may define the tasks
he wants accomplished without explicitly defining the
control structures that specify the flow of execution in a
synthesized program.

We will implement an automatic programming system based upon
the example computation specification method in order to
study the feasibility of identifying conditions from user
actions. We limit this study to the domain of text editing
in order to provide a well defined area in which to work. It
is hoped that the results of our efforts may provide insight
into the overall problem and generate further research which
will extend condition identification to other domains.

D. THESIS ORGANIZATION

The thrust of this thesis is the development of methods for
the automatic construction of conditions necessary for the
proper synthesis of programs from example computations.
Example computation is one approach to the problem of
program synthesis. Chapter One introduces the reader to
program synthesis and gives a brief historical perspective
of the evolution of this field of study. Chapter One also
provides a comparison of the different proposed approaches
to this problem.

An automatic programmer has been implemented to support this
research. This synthesizer was developed to use the example
computation method for program specification. Chapter Two is
a detailed explanation of our particular implementation.
Chapter Two includes a discussion of techniques we have
incorporated in our implementation which speed up the
synthesis process.

Chapter Three presents our approach to generating conditions
given an example computation. It describes algorithms which
will generate conditions from a sequence of editor
instructions.

Chapter Four discusses the results of our research. A brief
discussion is included on the merits of the synthesizer
which we have implemented and recommendations are given for
potential improvement. Finally, Chapter Four presents a
review of our work on identification and construction of
conditions from example computations. Areas requiring
further research have been highlighted and examples of
possible applications to other domains have been pointed
out.


II.  SYNTHESIZER 


A. GOALS

There is a two-fold purpose behind designing and building
the program synthesizer. The first directly relates to the
usefulness of the synthesizer. It is hoped that by "laying
the groundwork" for an autoprogramming system, the impetus
will be provided that will eventually result in a total
automatic programming environment being available for the
user. This environment is envisioned as an interactive one
consisting of several components: an interface to provide
the user with the means to perform example computations, a
link between the interface and the synthesizer which records
the user actions and transmits a trace of those actions to
the synthesizer, the synthesizer itself which produces the
algorithm in some internal form, and, finally, a translator
that receives the internal representation of the algorithm
and translates it into machine-readable form and/or
user-readable form. The second purpose for which the
synthesizer is built is to provide a suitable vehicle to be
used in the main area of research that this thesis explores.
If an autoprogrammer can generate correct algorithms from
example computations, how much can be done to relieve the
user from having to include branching or looping conditions
in his example computations?

B. OVERVIEW


1.  General  Description 

An automatic programming system which produces programs
based upon the user's input of example computations has a
natural appeal. Example computations are sequences of
instructions performed in an algorithmic manner. For
instance, if the user is doing a matrix multiply, computing
the entry for the resultant matrix involves the sum of
products from the appropriate row and column of the
multiplicand and multiplier matrices, respectively. When
humans communicate ideas to each other, the proper use of
example computations often plays a vital role. It is hard to
imagine trying to explain the method of multiplying two
matrices together, or trying to explain the concept of
set-subset relationships, without being able to draw
examples that enhance the explanations. This method of
communication seems to be vital to human understanding of
algorithms. Since programmers often use small example
computations while coding programs, it seems that a logical
approach to automatic programming would consist of the
machine doing the actual program synthesis based upon
example computations given by the programmer.

Program synthesis is the act of putting instructions
together in such a way that an algorithm is built which
accomplishes a desired task. Obviously, an algorithm which
is an exact replication of the sequence of instructions will
accomplish the task, but it is uninteresting since it cannot
be generalized to accomplish a set of related tasks. For
example, a linear sequence of instructions which multiplies
two 2 x 2 matrices together will only work for 2 x 2
matrices; however, by allowing loop constructs and if-then
constructs, an algorithm can be produced which performs the
more general task of multiplying any two matrices with legal
row and column dimensions. So, in the case of the matrix
multiply, the task of the program synthesizer is to produce
a general matrix multiply algorithm given the example
computation for a 2 x 2 matrix multiplication in some form
such as:


c[1,1] = a[1,1] * b[1,1] + a[1,2] * b[2,1]
c[1,2] = a[1,1] * b[1,2] + a[1,2] * b[2,2]
c[2,1] = a[2,1] * b[1,1] + a[2,2] * b[2,1]
c[2,2] = a[2,1] * b[1,2] + a[2,2] * b[2,2]
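The contrast between the straight-line example and the general algorithm can be sketched in Python. This is an illustration only: the function names are hypothetical, indices are 0-based rather than the 1-based indices used in the example computation, and the general version is written by hand here rather than produced by the synthesizer.

```python
def multiply_2x2_straight_line(a, b):
    """Replay the four example assignments literally (works for 2 x 2 only)."""
    c = [[0, 0], [0, 0]]
    c[0][0] = a[0][0] * b[0][0] + a[0][1] * b[1][0]
    c[0][1] = a[0][0] * b[0][1] + a[0][1] * b[1][1]
    c[1][0] = a[1][0] * b[0][0] + a[1][1] * b[1][0]
    c[1][1] = a[1][0] * b[0][1] + a[1][1] * b[1][1]
    return c


def multiply_general(a, b):
    """Loop-based generalization for any legal row and column dimensions."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):      # sum of products along row i, column j
                c[i][j] += a[i][k] * b[k][j]
    return c
```

Both functions agree on 2 x 2 inputs, but only the second accepts other dimensions, which is the generalization the synthesizer is asked to discover.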

Generalizing from the example computation also requires some
means of noting when the array bounds have been reached for
this example. In other words, conditions have to be
interposed between some instructions where a change in the
flow of control for the algorithm is necessary. An input
trace is defined as a sequence of instructions and
conditions which describes the example computation. In the
matrix multiply example this might be accomplished thusly:

C[1,1] = 0
C[1,1] = C[1,1] + A[1,1] * B[1,1]
C[1,1] = C[1,1] + A[1,2] * B[2,1]

COND - col index of A = col size of A

C[1,2] = 0
C[1,2] = C[1,2] + A[1,1] * B[1,2]
C[1,2] = C[1,2] + A[1,2] * B[2,2]

COND - col index of A = col size of A

.
.

C[2,2] = C[2,2] + A[2,2] * B[2,2]

COND - row & col index of C = dimension of C

STOP

The program synthesizer used for this thesis is designed
around concepts and ideas on synthesizing a program given
example traces as described in reference [17]. Previous
research, references [16], [17], and [18], seems to indicate
that correct programs can be synthesized on the basis of
relatively few sample computations, but that the amount of
time required to do the synthesis grows very quickly as a
function of program complexity.

2.  Trace  Coding 

The synthesis procedure is domain independent; that is, the
input trace can be coded into any consistent representation,
and it will not affect the operation of the synthesizer.
Since the synthesis procedure is independent of the input
trace representation, alphanumeric characters will be used
to represent instructions and conditions. They are
distinguished from each other by their position within the
trace rather than by their symbolic representation. For
example, an 'a' might represent an instruction or a
condition. Within the instruction set itself, identical
instructions are encoded as identical symbols. A simple
trace of a routine to find all positive numbers in an input
stream might be:

A = 0
READ B

COND - B is negative

A = A + 1
READ B

COND - B is negative

A = A + 1
READ B

COND - B is positive

PRINT B

.

.

If the instruction A=A+1 is represented by a 'b', each
occurrence of that instruction in the trace will have to be
represented by a 'b'. The reason for this constraint is
obvious. Since the synthesizer only receives a trace of the
example execution, it cannot determine whether A=A+1 is the
same instruction being encountered repeatedly in a loop, as
it is in this example, or whether there are several
independent occurrences of A=A+1. Figure 4 is an example of
a typical coded input trace. The left-hand column entries
are conditions and the right-hand column entries are
instructions. Figure 4 is read as state 's' transitions on
condition 'x' to state 'a', which in turn transitions on 'x'
to state 'b', and so forth.


transitions  states 


Figure  4.  Input  Trace 
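The coding step can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the coding routine used by our synthesizer: the function name `encode_trace` and the explicit 'I'/'C' tags are hypothetical (the thesis distinguishes instructions from conditions by position), and the letters handed out depend only on order of first appearance.

```python
import string


def encode_trace(steps):
    """Code a trace so identical instructions (or conditions) share a symbol.

    steps is a list of (kind, text) pairs, where kind is 'I' for an
    instruction or 'C' for a condition. Instructions and conditions use
    separate symbol tables, so the same letter may appear in both roles.
    This sketch supports at most 26 distinct symbols per table.
    """
    tables = {"I": {}, "C": {}}
    coded = []
    for kind, text in steps:
        table = tables[kind]
        if text not in table:
            table[text] = string.ascii_lowercase[len(table)]
        coded.append((kind, table[text]))
    return coded


example = encode_trace([
    ("I", "A = 0"), ("I", "READ B"), ("C", "B is negative"),
    ("I", "A = A + 1"), ("I", "READ B"), ("C", "B is negative"),
    ("I", "A = A + 1"), ("I", "READ B"), ("C", "B is positive"),
    ("I", "PRINT B"),
])
```

Every occurrence of A = A + 1 receives the same symbol, which is exactly the constraint discussed above: the coded trace alone cannot say whether those occurrences are one looped instruction or several independent ones.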


3.  Input/Output  Trace  Representation 

A Moore-type representation, as defined in [19], can be used
to highlight certain features that must be dealt with when
producing an algorithm from an example trace. Throughout the
rest of the discussion, Moore machines and algorithms will
be used synonymously. Conditions relate to transitions and
instructions relate to states of the machine. In fact, the
function of the synthesizer can be viewed as that of
determining a minimum-state deterministic Moore machine
equivalent of a non-deterministic Moore machine.
Representing input traces as Moore machines will often show
the non-deterministic structure of the example trace. This
non-determinism must be resolved by the synthesizer in order
for an algorithm to be generated. Figure 5 is the Moore
machine representation of the input trace of Figure 4.
Notice that at node 'b', the trace is non-deterministic.
Transition 'y' leads from node 'b' to two different nodes;
similarly, transition 'x' leads from node 'b' to two
separate nodes. Figure 6 is the deterministic Moore machine
which has been constructed by our synthesizer based upon the
input trace given in Figure 4. The non-determinism has been
resolved by splitting state 'a' into two states
distinguished from each other by an integer prefix label.
The assignment of the prefix label is the mechanism used by
the synthesizer to prevent non-determinism. In order to
accomplish this assignment, the synthesizer uses an
enumeration technique. Each instruction is assigned a prefix
label in a manner that maintains determinism and assures
that the algorithm will correctly execute the input trace.
It is easy to verify that the deterministic Moore machine of
Figure 6 will execute the trace.
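How non-determinism shows up in the unlabelled Moore machine can be sketched as follows. The function name and the two-list trace layout are assumptions made for this illustration, and the sample trace is a small hypothetical one, since the contents of Figure 4 are not reproduced here.

```python
from collections import defaultdict


def nondeterministic_transitions(instrs, conds):
    """Report (state, condition) pairs that lead to more than one state.

    instrs holds the instruction symbols in trace order; conds[i] is the
    condition between instrs[i] and instrs[i + 1]. Every instruction
    symbol is treated as a single state (no prefix labels yet).
    """
    if len(conds) != len(instrs) - 1:
        raise ValueError("need one condition between consecutive instructions")
    successors = defaultdict(set)
    for i, cond in enumerate(conds):
        successors[(instrs[i], cond)].add(instrs[i + 1])
    return {key: nxt for key, nxt in successors.items() if len(nxt) > 1}


# Hypothetical trace: a -x-> a -x-> a -x-> a -y-> a -y-> b
conflicts = nondeterministic_transitions(list("aaaaab"), list("xxxyy"))
```

Here state 'a' moves on 'y' both to 'a' and to 'b', so at least one occurrence of 'a' must be split off with its own prefix label before the machine is deterministic.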
C. SYNTHESIS PROCEDURE


1. Function

The function of the synthesizer program is to provide a
minimum-state, correct program consistent with the input
trace of the example computation. The synthesis process will
be completed when it is determined which occurrence of a
labelled instruction corresponds to each particular
instruction in the input trace. In order to accomplish this
goal, the synthesizer is basically structured as a
depth-first search algorithm. Backup and fixup mechanisms
exist to enhance the search procedure when pruning has not
kept the algorithm from traversing a fruitless branch of the
search tree. The search mechanism attempts to assign a label
to each instruction in such a manner that the generated
algorithm remains technically correct; that is,
nondeterminism is not allowed to exist and the original
trace can still be executed. A number of techniques exist
within the synthesizer which aid pruning of the search tree,
and thereby make it possible to synthesize more complicated
programs in a reasonable amount of time than could otherwise
be expected from a general enumeration technique. These
techniques offset the major disadvantage of a general
enumerative search technique, namely exponential growth of
the search space as a function of input.


2.  Concepts 

Certain definitions and concepts must be presented before
the actual algorithm is discussed. In order to facilitate
the discussion, it is necessary to refer to Figure 7. Each
level in the figure consists of an
instruction-condition-instruction triple, referred to as an
I-C-I. In Figure 7 the leftmost symbol under I-C-I is
referred to as the leading instruction of the triple, the
middle symbol is the condition, and the rightmost symbol is
the trailing instruction. The trailing instruction at level
i becomes the leading instruction at level i + 1. So this
input trace represents the instruction-condition sequence
's r a n s r a ...'.

level  I-C-I

  1    sra
  2    ans
  3    sra
  4    ala
  5    axa
  6    aya
  7    axa
  8    anr

Figure 7. Instruction-Condition-Instruction Triple

Two levels i and j are said to belong to the same
couple-class if the elements of the level are the same.
Instruction elements of the trace which are in the same
couple-class may be assigned the same prefix label during
synthesis if the assignment does not cause non-determinism.
For example, given the trace in Figure 7, levels 1 and 3 are
in the same couple-class, as are levels 5 and 7. Difference
set relations are another situation of interest. Here the
first two elements of level i and level j are the same, but
the third element is not the same. A difference set relation
indicates that the leading instructions cannot be
represented by the same state, whatever prefix labels are
assigned during synthesis, because the leading instruction
would have the same transition to two different trailing
instructions. Again using the above trace, level 2 and level
8 fall into this category. In this situation, the index 8
would be entered into the difference set for level 2. By
implication, the index 2 is also in the difference set for
level 8, although, in practice, it is not entered.
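The two relations just defined can be computed directly from the list of I-C-I triples. The Python sketch below is illustrative only: the function names and the list-of-strings layout are assumptions, and the triples are copied from Figure 7 as printed (the condition at level 4 may be garbled in the source, which does not affect the result).

```python
from collections import defaultdict


def couple_classes(triples):
    """Map each I-C-I triple occurring more than once to its 1-based levels."""
    levels = defaultdict(list)
    for level, triple in enumerate(triples, start=1):
        levels[triple].append(level)
    return {t: ls for t, ls in levels.items() if len(ls) > 1}


def difference_sets(triples):
    """For each level i, collect later levels j with the same leading
    instruction and condition but a different trailing instruction.
    As in the text, the index is entered only at the earlier level."""
    diff = defaultdict(set)
    for i, ti in enumerate(triples, start=1):
        for j, tj in enumerate(triples[i:], start=i + 1):
            if ti[:2] == tj[:2] and ti[2] != tj[2]:
                diff[i].add(j)
    return dict(diff)


figure_7 = ["sra", "ans", "sra", "ala", "axa", "aya", "axa", "anr"]
classes = couple_classes(figure_7)
diffs = difference_sets(figure_7)
```

For the Figure 7 trace this yields the couple-classes at levels 1 and 3 and at levels 5 and 7, and a single difference set entry placing index 8 in the set for level 2.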

Once the initial couple-class information and difference set
information have been determined, additional difference set
information can be obtained through the chaining nature of
differencing. For example, suppose the trace consists of the
one shown in Figure 8. Then the Moore machine representation
of this trace is shown in Figure 9.

index  trace

  5    axa
  6    axa
  7    ays
  8    axa
  9    axa
 10    ayt

Figure 8. Chaining of Difference Set Relations


Figure 9. Non-deterministic Input Trace

This machine is obviously nondeterministic since state 'a'
transitions by 'y' to two different states. Difference set
resolution requires that the index for 'ayt' be in the
difference set of 'ays'. Since that requirement causes
different states to represent the 'a' in 'ayt' and in 'ays',
and further since the trailing 'a' in the preceding level is
exactly the same instruction, the preceding levels now
satisfy the difference set relation. The leading instruction
and the condition are the same, but the trailing
instructions in the I-C-I triples are different since they
have previously been assigned to a difference set relation.
Therefore, the lead instruction must be labelled with a
different prefix during assignment and similarly, the levels
above them. So the Moore machine will now be deterministic
and in the following form.


Figure 10. Deterministic Trace

Given a partial trace derived from the example execution,
there are numerous Moore machines that can be constructed to
satisfy the trace. At one end of the spectrum, a program can
be constructed such that each succeeding state is assigned a
different prefix label. This method always results in a
straight-line program. Each instruction has one transition
entering it and one transition exiting from it. This method
produces the maximum size program consistent with the input
trace. See Figure 11. This is not a particularly desirable
method since it does not recognize loop structures that can
significantly reduce the size of the program. Additionally,
it hides the basic structure of the algorithm. The major
advantage, of course, is that absolutely no search is
required to produce a deterministic machine.


condition  instruction

               a
    z          a
    x          a
    x          a

Figure 11a. Trace          Figure 11b. Program

Figure 11. Straight-line Program

On the other end of the spectrum, a program can be
constructed such that each identical instruction receives
the same prefix label. This method takes full advantage of
loop structures, and will result in a minimum state machine.
However, such a method will seldom produce a deterministic
machine; therefore, it will not produce a satisfactory
algorithm. See Figure 12.


level  cond  instr

  1           a
  2     x     a
  3     x     a
  4     x     a
  5     y     a
  6     y     b

Figure 12a. Trace


Figure 12. Minimum State Machine




The best solution lies somewhere between these endpoints. A
reasonable first guess at the number of states required to
produce a deterministic machine within this spectrum can be
made by establishing a lower bound on the number of states.
The cardinality of the instruction set is defined as the
number of different instructions appearing in the trace.
Using the above figure as an example, it can be determined
that the cardinality of the instruction set is two; that is,
there are two different instructions, 'a' and 'b', in the
trace. This measure provides an absolute lower bound on the
number of states required in the final machine. This lower
bound can be refined by determining a lower bound on the
number of states needed for each individual instruction.
Once again, the above figure illustrates this concept. The
instruction 'a' at level 5 must be different than the
instructions at levels 1 through 4 because of difference set
resolution, or else nondeterminism results on the transition
'y'. Therefore, in order to maintain determinism, the
instruction 'a' must be allowed at least two states.
Summation of the lower bounds for each of the instructions
gives a lower bound on the total number of states required
for the machine. For this particular example, the program
would be generated as:






Figure  13.  Instruction  Set  Lower  Bounds 

If the search space is viewed as a tree structure, then the
levels of the tree can be associated with the instructions
by assigning the first instruction in the input trace to the
first level, the second instruction to the second level, and
so forth. The branching factor at each level is the state
lower bound computed for the instruction seen at that level.
The prefix label assigned to the instruction is represented
by the specific branch used to traverse to the next level.

The idea of providing a lower bound on the number of states
leads to an iteratively expanding depth-first search. When
all possible combinations of prefix labels have been tried,
but the algorithm remains non-deterministic, the lower bound
is incremented and the search is restarted from the top
level. When the lower bound is increased, the search tree
obtains additional paths to the final solution by increasing
the branching factor associated with one or more
instructions. The depth of a successful search into the tree
is restricted by the lower bound on the number of nodes
required by the deterministic machine. Only when a pattern
of prefix assignments has been made which allows the
algorithm to remain deterministic and all of the
instructions in the original trace have been assigned prefix
labels will the synthesis terminate. This mechanism prevents
a straight-line model from being output as the algorithm
unless it is the only one that can satisfy the input trace.
More importantly, it provides the minimum-state
deterministic machine capable of executing the input trace.
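The iteratively expanding depth-first search can be sketched in Python as below. This is a simplified illustration under stated assumptions, not our implementation: it uses one global budget on the total number of states (starting at the cardinality of the instruction set) instead of per-instruction bounds, it omits the difference set and couple-class pruning described later, and the names `synthesize` and `_search` are hypothetical.

```python
def synthesize(instrs, conds):
    """Return one (instruction, prefix_label) state per trace position."""
    if len(conds) != len(instrs) - 1:
        raise ValueError("need one condition between consecutive instructions")
    lower_bound = len(set(instrs))      # cardinality of the instruction set
    for limit in range(lower_bound, len(instrs) + 1):
        states = _search(instrs, conds, limit)
        if states is not None:
            return states               # first success uses the fewest states
    return None   # not reached: a straight-line program always fits


def _search(instrs, conds, limit):
    labels = []    # prefix label chosen for each position so far
    delta = {}     # (state, condition) -> next state
    in_use = {}    # instruction -> number of labels handed out

    def dfs(k):
        if k == len(instrs):
            return True
        ins = instrs[k]
        count = in_use.get(ins, 0)
        for label in range(1, count + 2):    # existing labels, then one new
            fresh = label == count + 1
            if fresh and sum(in_use.values()) + 1 > limit:
                continue                     # would exceed the state budget
            state = (ins, label)
            new_key = None
            if k > 0:
                key = ((instrs[k - 1], labels[k - 1]), conds[k - 1])
                if delta.get(key, state) != state:
                    continue                 # would be non-deterministic
                if key not in delta:
                    delta[key] = state
                    new_key = key
            if fresh:
                in_use[ins] = label
            labels.append(label)
            if dfs(k + 1):
                return True
            labels.pop()                     # back up and undo the assignment
            if fresh:
                in_use[ins] = count
            if new_key is not None:
                del delta[new_key]
        return False

    return list(zip(instrs, labels)) if dfs(0) else None
```

For the trace of Figure 12a, `synthesize(list("aaaaab"), list("xxxyy"))` fails with a budget of two states and succeeds with three, giving the fifth 'a' its own prefix label.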

D.  SYNTHESIZER  STRUCTURE 

The synthesis program is subdivided into two primary
modules: static processing of the input trace; and dynamic
processing of the information extracted from the input trace
by the preprocessing, or static processing, phase. Static
processing provides information such as couple-classes,
difference sets, and lower bounds on the number of machine
states. Dynamic processing uses knowledge inherited from
preprocessing to guide the search mechanism to a final
output of the algorithm. These two modules will be discussed
in turn, and the primary mechanisms involved will be
amplified.

1. Static Processing

Static processing can be conceptualized as consisting of
three main functions: (a) accept the input trace; (b)
preprocess the trace for difference sets, couple-classes,
and state bounds; and (c) prepare a trace table for further
use by dynamic processing. Once this preprocessing has been
accomplished, the static module is no longer necessary to
the synthesizer.

In the current configuration, the static module expects to
find the input as a sequence of
instruction-condition-instruction triples. Figure 14 is an
example of an input trace.

level  trace

  1    anp
  2    psa
  3    aga
  4    ayr
  5    rsr
  6    rsr
  7    rra
  8    aga
  9    ayt

Figure 14. Typical Input to Static Processor

Each line consists of a triple, for example 'anp'. The 'a'
represents an instruction, and the 'n' represents the
condition which causes the program trace to transition to
the next instruction 'p'. For each level, the first element
represents the same instruction as the last element of the
preceding level. This is easier to see if the above trace is
represented as a Moore machine in which the nodes are
instructions and the conditions are transitions. State 'a'
transitions on condition 'n' to state 'p', which transitions
on condition 's' to state 'a', which transitions on
condition 'g' back to state 'a', etc. Each occurrence of an
instruction symbol in the input trace is represented by the
same state at this point in the synthesis.

Once the input trace has been accepted, static processing
can begin. Static processing consists of determining the
level indices associated with each couple-class and with
each difference set. For the trace of Figure 14, these are
shown in Figure 15.

There are two couple-classes in this trace. They are [aga]
at levels 3 and 8, and [rsr] at levels 5 and 6. The
remaining levels are not assigned to a couple-class because
no other levels match with them. Couple-class information is
useful to the dynamic processor for determining forced
assignments and dynamic non-equivalence. These ideas will be
discussed more fully in the section on dynamic processing.

Difference sets exist for levels 3 and 4. Level 4 has a
difference set which contains the index 9; that is, the
element at level 4, 'ayr', must have a different prefix
label on 'a' than the element at level 9, 'ayt'. If the 'a'
is not labelled differently during the synthesis,
nondeterminism will result since the same transition would
lead to different nodes.

Difference set resolution is a very powerful mechanism for
ensuring deterministic behavior of the algorithm. A
considerable amount of the prefix label assignments to the
nodes can be resolved using difference sets. Notice that
level 8 appears in the difference set for level 3 even
though levels 3 and 8 are in the same couple-class. At first
this appears contradictory since equivalent couple-class
names imply that the elements are the same, but difference
set existence forces the lead instructions to be different.
This points out the relative power of couple-class
information and difference set information. Difference set
information is immutable. Couple-class information only
hints at equivalence. In this particular example, the entry
at level 3 was caused by the chaining effect of difference
set resolution. Notice that since the 'a' at level 4 must be
different than the 'a' at level 9, and since the trailing
'a' at level 3 is, by definition, the same as the leading
'a' at level 4, the trailing 'a' at level 3 cannot be the
same as the trailing 'a' at level 8; therefore, levels 3 and
8 cannot be in the same couple-class.
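The chaining effect can be sketched as a fixed-point computation over pairs of levels. The Python below is an illustrative sketch with a hypothetical function name; it assumes the trace is contiguous, so that the trailing instruction at level i is the leading instruction at level i + 1.

```python
def difference_sets_with_chaining(triples):
    """Return {level: set of later levels} after chaining is applied."""
    n = len(triples)
    differs = set()    # pairs (i, j), i < j, whose leading instructions differ
    # Initial relations: same lead and condition, different trailing symbol.
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            ti, tj = triples[i - 1], triples[j - 1]
            if ti[:2] == tj[:2] and ti[2] != tj[2]:
                differs.add((i, j))
    # Chaining: if the leads at levels i+1 and j+1 must differ, then the
    # trailing instructions at levels i and j differ as well.
    changed = True
    while changed:
        changed = False
        for i in range(1, n):
            for j in range(i + 1, n):
                ti, tj = triples[i - 1], triples[j - 1]
                if (ti[:2] == tj[:2] and (i + 1, j + 1) in differs
                        and (i, j) not in differs):
                    differs.add((i, j))
                    changed = True
    diff = {}
    for i, j in sorted(differs):
        diff.setdefault(i, set()).add(j)
    return diff


figure_14 = ["anp", "psa", "aga", "ayr", "rsr", "rsr", "rra", "aga", "ayt"]
chained = difference_sets_with_chaining(figure_14)
```

On the Figure 14 trace the initial pass finds only the relation between levels 4 and 9; the chaining pass then adds index 8 to the difference set for level 3, as described above.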

To compute the lower bound on the number of states in the
algorithm, the minimum number of states needed for each
instruction is summed. For this same example, the
instruction set consists of {a,p,r,t}. The bounds for p, r,
and t are each 1. The bound for 'a' is 2. There must be at
least two different occurrences of 'a' from the difference
set resolution. Therefore, the minimum number of states with
which a deterministic Moore machine can be constructed for
this trace is 5.
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Finally,  static  processing  passes  all  tne 
inf  o  rmation  concerning  tne  input  trace  to  tne  dynamic 


processor  via  a  trace  table  in  tne  following  torm.  Eacn 
level  nas  only  one  associated  condition  and  one  associated 
instruction.  Since  difference  set  information  is  associated 
with  tne  lead  instruction  in  an 
instruction-condition-instruction  sequence,  it  is  entered  at 
that  level.  Since  couple-class  information  is  associated 
wltn  tne  entire  instruction-condition-instruction  sequence, 
it  is  associated  with  the  trailing  condition-instruction 
pair . 
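
The lower-bound computation and the trace table layout
described above can be sketched roughly as follows. This is
only an illustration: the class name TraceRow, its field
names, and the function lower_bound are assumptions made for
the sketch and do not come from the original implementation.
The per-instruction bounds are the ones quoted in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TraceRow:
    """One level of the trace table (field names are illustrative)."""
    level: int
    condition: str
    instruction: str
    # Levels whose instruction must differ from this (lead) instruction.
    difference_set: Set[int] = field(default_factory=set)
    # Name shared by identical instruction-condition-instruction triples;
    # stored with the trailing condition-instruction pair.
    couple_class: str = ""


def lower_bound(min_states: Dict[str, int]) -> int:
    """Sum the minimum number of states needed for each instruction."""
    return sum(min_states.values())


# Bounds quoted in the text for the instruction set {a, p, r, t}.
bounds = {"a": 2, "p": 1, "r": 1, "t": 1}
total = lower_bound(bounds)  # 5
```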
Figure 17. Trace Table

2.  Dynamic  Processing 

Dynamic processing involves assigning prefix labels to the
states of the machine. In this way, separate occurrences of
the same instruction are differentiated. The dynamic
processor is the search mechanism for the synthesizer. It
operates in such a way that, at any point in the synthesis,
the portion of the trace previously processed represents a
deterministic Moore machine. In order to maintain the
determinism, dynamic processing steps through three phases:
(1) assignment of the prefix label to the instruction,
(2) difference set resolution, and (3) dynamic equivalence
assurance. Additionally, each of these phases has built-in
fixup and backup conditions associated with it. The
fixup/backup conditions encountered during difference set
resolution or during dynamic equivalence checking are
indicators that, if the current assignments remain the same,
a nondeterminism will occur in future assignments. As such,
they inform the pruning mechanisms of the search algorithm.

An integral part of the dynamic processor is the failure
memory. It controls the search. The failure memory may be
conceptualized as an L x P matrix, where L is the number of
rows and corresponds to the number of levels in the trace.
Each row has P columns, where P is equal to the lower bound
assigned to the instruction contained on that level of the
trace. An entry into the failure memory at some level i and
some column j, where 1 <= i <= L and 1 <= j <= P, prevents
the assignment of j as a prefix label for the instruction at
level i. When a failure memory cell contains an entry it is
called a valid cell; otherwise it is invalid. Each cell of
the failure memory is a two-element entry. The structure
factor is the first element. It indicates which level of the
trace caused the entry. The free state factor is the second
element. As the name indicates, this element is a function
of the number of free states available at the time of
assignment. The specifics of the failure memory operation
and the nature of failure memory entries will be discussed
throughout the rest of the section as each phase of the
dynamic processor is discussed.
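
As a rough illustration of the structure just described, the
failure memory might be modelled as below. The class and
method names are assumptions made for this sketch. Two
conventions are borrowed from later passages of this
section: an empty cell is written (0,0), and the free state
factor is stored as the number of free states plus one.

```python
from typing import List, Optional, Tuple

Cell = Tuple[int, int]   # (structure factor, free state factor)
INVALID: Cell = (0, 0)   # an empty cell


class FailureMemory:
    """L rows (one per trace level); row i has P_i columns (labels)."""

    def __init__(self, bounds_per_level: List[int]) -> None:
        # bounds_per_level[i - 1] is the lower bound P for level i.
        self.rows: List[List[Cell]] = [[INVALID] * p for p in bounds_per_level]

    def enter(self, level: int, label: int,
              cause_level: int, free_states: int) -> None:
        """Forbid `label` at `level`; record which level caused the entry."""
        self.rows[level - 1][label - 1] = (cause_level, free_states + 1)

    def is_valid(self, level: int, label: int) -> bool:
        """A valid cell contains an entry, so the label is forbidden."""
        return self.rows[level - 1][label - 1] != INVALID

    def first_invalid(self, level: int) -> Optional[int]:
        """Lowest label still allowed at `level`, or None if the row is full."""
        for label, cell in enumerate(self.rows[level - 1], start=1):
            if cell == INVALID:
                return label
        return None


fm = FailureMemory([2, 2, 3])
fm.enter(level=3, label=1, cause_level=1, free_states=0)
```

Here the entry at (3,1) would be displayed as '1.1': it was
caused by level 1 when no free states were available.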
a.  Label  Assignment 

As previously mentioned, label assignment is the first
function provided by the dynamic processor. A label
assignment can be either forced or arbitrary. Additionally,
the assignment can result in the creation of a new state, a
label-name combination not seen before. A forced assignment
occurs when the instruction at the current working level is
a member of the same couple-class as an instruction at a
prior level, and the lead instruction into both of those
levels has the same label assignment. The current working
level is defined as the level of the trace which contains
the most recently assigned prefix label, but at which
difference set resolution and dynamic equivalence checking
have not been completed. An example is given in the trace
shown in Figure 18.

Figure 18. Partial Trace Labelling

The label at level 7 is forced by the label assignments at
levels 4 and 5. Notice that the instructions at level 5 and
at level 7 are in the same couple-class, and that the
instructions at levels 4 and 6 have the same prefix label.
This condition forces the instruction at level 7 to have the
same prefix label as the instruction at level 5. The Moore
machine representation of the partial trace is shown in
Figure 19. The assignment at level 8 is also forced for
similar reasons. By definition, any forced assignment
involves previously assigned states, that is,
label-instruction combinations that have been seen before;
therefore, no forced assignment can result in a new state.
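
The forced-assignment rule can be sketched as a small check.
The function name, the dictionary layout, and the concrete
label values below are hypothetical; only the relationship
between levels 4, 5, 6 and 7 follows the example in the
text.

```python
from typing import Dict, Optional


def forced_label(level: int,
                 couple_class: Dict[int, str],
                 labels: Dict[int, int]) -> Optional[int]:
    """Return the label forced at `level`, or None if it is not forced.

    couple_class[k] names the couple-class of the triple that ends at
    level k; labels[k] is the prefix label already assigned at level k.
    """
    cls = couple_class.get(level)
    lead = labels.get(level - 1)        # label of the lead instruction
    if cls is None or lead is None:
        return None
    for prior in range(2, level):
        if (prior in labels
                and couple_class.get(prior) == cls
                and labels.get(prior - 1) == lead):
            return labels[prior]        # must reuse the earlier state
    return None


couple_class = {5: "k", 7: "k"}         # levels 5 and 7 share a class
labels = {4: 1, 5: 2, 6: 1}             # levels 4 and 6 share a label
result = forced_label(7, couple_class, labels)
```

With these illustrative values, level 7 is forced to take
the label already held by level 5. If level 6 had a label
different from level 4, the function would return None and
the assignment would be arbitrary.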


Figure 19. Partially Determined Moore Machine


The failure memory can be used in conjunction with forced
assignments to signal a backup condition to the search. If
the failure memory entry corresponding to the label
assignment at the current working level is valid, then a
contradiction results from the forced assignment. Suppose
that the trace table and failure memory are as shown in
Figure 20, and the forced assignment at level 8 has just
been made. The entry '1.1' at row 2, column a of the failure
memory is interpreted in the following manner. The integer
to the left of the decimal point indicates that the entry
was caused by the current assignment at level 1. The '1' to
the right of the decimal point is the number of free states
+ 1 available when the assignment at level 1 caused the
failure memory entry; therefore, when the entry was made
there were no free states available. A free state is one
which has not been bound to a particular instruction.

The assignment at level 9 is forced. In other words, the
sequence of the previous assignments causes the prefix label
of the instruction at level 8 to be a 2. However, the
failure memory contains an entry at row 9, column 2,
FM(9,2). This entry indicates that the instruction at level
8 cannot be assigned the label '2', for if it were assigned
a '2', a nondeterminism would result. To resolve the
conflict, backup is initiated until the last unforced
assignment is found. In this case, the backup is to level 6.
The assignment at level 6 will be changed and the search
will continue from there.

Figure 20. Trace Table/Failure Memory Configuration
for a Forced Assignment

If the assignment is not forced, the failure memory row
corresponding to the current working level is searched for
the first occurrence of an invalid cell. An invalid cell is
one which does not contain a failure memory entry. If a cell
is invalid, the assignment of a prefix label corresponding
to the failure memory column index for that cell is possible
on that level of the trace. The column number of the first
invalid cell becomes the label assignment for the
instruction at that level. For example, suppose level 5 is
the current working level and the trace table and failure
memory have the configuration shown in Figure 21.


       Trace Table            Failure Memory

level   cond   instr        1       2       3

  5      r      a          1.1     4.1

Figure 21. Trace Table Entry Showing
Arbitrary Assignment Method

The first invalid entry in the failure memory on row 5 is in
column 3; therefore, instruction 'a' for level 5 will be
assigned a prefix label of 3. These non-forced assignments
may result in the creation of a new state; that is, a
label-instruction pair not previously assigned during the
synthesis. If, at some future point in the search, a backup
is initiated that reaches a level at which a new state was
created, the backup mechanism will not stop to perform a
retry. At any point in the synthesis, all previous levels
have received assignments based on the constraint that the
minimum number of states has been used consistent with
maintaining determinism; therefore, assigning a different
prefix label to a state which has been defined as a new
state only changes the name of the state, and does not
change the structure of the algorithm. Since the structure
of the algorithm has not been changed, the cause of the
nondeterminism is still present.
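
A minimal sketch of the arbitrary assignment rule follows.
The function name and the set of already-known states are
assumptions for illustration; the failure memory row mirrors
the entries 1.1 and 4.1 shown for level 5 in Figure 21, with
None standing for an invalid cell.

```python
from typing import List, Optional, Set, Tuple

Cell = Optional[Tuple[int, int]]   # None marks an invalid (empty) cell


def arbitrary_assignment(row: List[Cell], instruction: str,
                         known_states: Set[Tuple[int, str]]
                         ) -> Optional[Tuple[int, bool]]:
    """Pick the first allowed label and report whether it is a new state."""
    for label, cell in enumerate(row, start=1):
        if cell is None:
            is_new_state = (label, instruction) not in known_states
            return label, is_new_state
    return None                     # every cell is valid: nothing to assign


row_5 = [(1, 1), (4, 1), None]      # entries 1.1 and 4.1, third cell empty
known = {(1, "a"), (2, "a")}        # hypothetical states assigned so far
choice = arbitrary_assignment(row_5, "a", known)
```

Under these assumptions the instruction receives label 3,
and because the pair (3, 'a') has not been seen before the
assignment is flagged as creating a new state.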

One other type of assignment should be mentioned at this
point. Pseudo-assignment occurs when there is only one
invalid cell left in a failure memory row at a level other
than the current working level and there are no free states
available. Although pseudo-assignment does not immediately
cause a label to be assigned to the instruction at that
level, it does simulate a look-ahead mechanism for the
search technique by triggering difference set resolution and
dynamic equivalence checking as if that level of the trace
were assigned a value. Since the pseudo value is the only
value currently possible for that level, if a backup or
fixup condition is encountered during pseudo-assignment, the
assignment mechanism can immediately try another label at
the current working level, thereby saving the unnecessary
search of a path which it already knows to be nonproductive.

Once a tentative label assignment has been made to the
instruction at the current working level, difference set
resolution and dynamic equivalence checking can be
performed. Although these actions may cause a fixup on the
prefix label at the current working level, their primary
purpose is to furnish information to the failure memory that
will help guide future label assignments.
b.  Difference  Set  Resolution 

Difference set resolution prevents future assignments from
being made that are known to cause nondeterminism if the
current assignments remain unchanged. Difference sets
outline a significant portion of the structure of the input
trace without regard to label [...] assigned to the
instruction at the level from which the difference set is
being resolved if the cell has not already been made valid
through a previous assignment. For example, if the prefix
assignment at level 1 is a '1', the failure memory entries
are made in column 1 at levels 3,5,15,1?. Similarly, when
the assignment '1' is made at level 2, failure entries are
made at levels 4 and 11. Now when the assignment at level 3
is made, the dynamic processor will not try to assign a
prefix value of '1' since the failure memory cell at (3,1)
is valid. The assignment will automatically be '2'. Notice
that at level 5 the previous assignments have caused the
prefix label to be a '3'. In other words, the failure memory
has caused the search tree to be pruned so that an
assignment of '1' or '2' will not be tried. Either one of
these assignments would have resulted in nondeterminism
being introduced into the trace at level 6.
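
The pruning effect of difference set resolution can be
illustrated with a short sketch. The difference sets below
are a reduced, hypothetical version of the example (the full
sets are not recoverable here), every row is given three
columns for simplicity, and every assignment is treated as
arbitrary. The function names are assumptions as well.

```python
from typing import Dict, List, Optional, Tuple

Cell = Optional[Tuple[int, int]]   # None marks an invalid (empty) cell


def first_invalid(fm: Dict[int, List[Cell]], level: int) -> Optional[int]:
    """Lowest label not yet forbidden at `level`."""
    for label, cell in enumerate(fm[level], start=1):
        if cell is None:
            return label
    return None


def resolve_difference_set(fm: Dict[int, List[Cell]],
                           diff_sets: Dict[int, List[int]],
                           level: int, label: int,
                           free_states: int = 0) -> None:
    """Forbid `label` at every level in the difference set of `level`."""
    for other in diff_sets.get(level, ()):
        if fm[other][label - 1] is None:          # keep earlier entries
            fm[other][label - 1] = (level, free_states + 1)


P = 3
fm = {level: [None] * P for level in range(1, 12)}
diff_sets = {1: [3, 5], 2: [4, 11], 3: [5]}       # illustrative only
labels: Dict[int, int] = {}
for level in range(1, 6):
    labels[level] = first_invalid(fm, level)
    resolve_difference_set(fm, diff_sets, level, labels[level])
```

After the loop, level 3 has been given label 2 and level 5
label 3 without labels 1 or 2 ever being tried there, which
is the pruning behaviour described above.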


Figure 24a. Prefix Label Equals 1

Figure 24b. Prefix Label Equals 2

Figure 24. Nondeterministic Prefix Label Assignments

While failure memory entries are being made under difference
set resolution, it is possible for a row to have all cells
valid except one. This has been previously defined as a
situation leading to pseudo-assignment. This situation has
occurred at level 11 in the example given in Figure 23. When
such an occurrence happens, a look-ahead mechanism is
triggered to resolve the difference set at that level. In
this example, the failure memory cell at (21,3) has been
validated with an entry which indicates the current working
level as level 4 when the pseudo-assignment occurred at
level 11. Another situation which can occur in a failure
memory row is when all the entries in the row become valid.
This condition is called an incipient fence. When an
incipient fence exists and there are no free states
available, then no assignment can be made at that level.
This condition is called a fence.
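
The row conditions named so far (pseudo-assignment,
incipient fence, fence) can be summarized in one small
classifier. This is a sketch under the assumption that a row
is a list of cells with None for an invalid cell; the
function name and the label 'open' for the ordinary case are
not from the original text.

```python
from typing import List, Optional, Tuple

Cell = Optional[Tuple[int, int]]   # None marks an invalid (empty) cell


def classify_row(row: List[Cell],
                 free_states: int) -> Tuple[str, Optional[int]]:
    """Classify a failure memory row at a level below the working level.

    Returns (condition, label); label is set only for pseudo-assignment.
    """
    invalid = [label for label, cell in enumerate(row, start=1)
               if cell is None]
    if not invalid:
        # Every cell is valid: an incipient fence, or a fence when no
        # free state is left to extend the row.
        return ("fence" if free_states == 0 else "incipient fence", None)
    if len(invalid) == 1 and free_states == 0:
        # Only one label remains possible: treat it as if assigned.
        return ("pseudo-assignment", invalid[0])
    return ("open", None)


pseudo = classify_row([(1, 1), (4, 1), None], free_states=0)
fence = classify_row([(1, 1), (4, 1)], free_states=0)
```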

Since the search mechanism always knows the level from which
it is doing look-ahead by difference set resolution, it is
able to perform a fixup on the label assignment at the
earliest possible time. A fixup is accomplished by
incrementing the prefix label by one. If an entire row in
the failure memory becomes valid and there are no free
states available, a fixup must be performed on the label
assignment at the current working level. If the label is
left the same, then when the search reaches the fenced
level, no assignment will be possible. Each time a fixup
occurs, all entries made in the failure memory as a result
of the previous label assignment are deleted, and entries
are then made based on the new label.
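
A fixup along these lines might look as follows. The
function name, the dictionary layout, and the decision to
return None when the incremented label is not legal are
assumptions of this sketch; re-entering failure memory
entries for the new label is left to difference set
resolution and is not shown.

```python
from typing import Dict, List, Optional, Tuple

Cell = Optional[Tuple[int, int]]   # (cause level, free states + 1) or None


def fixup(fm: Dict[int, List[Cell]], labels: Dict[int, int],
          level: int) -> Optional[int]:
    """Increment the label at `level` by one.

    Returns the new label, or None when the incremented label is not a
    legal assignment (the caller would then have to back up).
    """
    new_label = labels[level] + 1
    row = fm[level]
    if new_label > len(row) or row[new_label - 1] is not None:
        return None
    # Delete every entry that the previous label at this level caused.
    for cells in fm.values():
        for col, cell in enumerate(cells):
            if cell is not None and cell[0] == level:
                cells[col] = None
    labels[level] = new_label
    return new_label


fm = {4: [None, None, None], 9: [(4, 1), None, (2, 1)]}
labels = {4: 1}
new_label = fixup(fm, labels, 4)
```

In this hypothetical configuration the entry that level 4
had placed at (9,1) is removed, while the entry caused by
level 2 is kept.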
c.  Dynamic  Equivalence 

Couple-class information furnished by static processing aids
in the determination of dynamic nonequivalence. Dynamic
nonequivalence can occur during the synthesis at any level
below the current working level when the couple-classes are
equal. Dynamic equivalence results when instructions in the
same couple-class have been assigned the same prefix label.
Consider Figure 25. The I-C-I triples at levels 5 and 6 and
at levels 11 and 12 are the same; therefore, they are in the
same couple-class. The instruction 'a' at level 5 has been
assigned a prefix of '2', and the instruction 'a' at level 6
has been assigned a prefix of '1'. Now, if the instruction
at level 11 is assigned a prefix of '2' and the instruction
at level 12 is assigned a prefix of '1', dynamic equivalence
will occur. Further, the assignment at level 12 will be
forced. Dynamic nonequivalence results when such an
assignment scheme causes nondeterminism. Dynamic equivalence
checking functions as a look-ahead mechanism by preventing
the future occurrence of a forced assignment which will
result in nondeterminism. Suppose the synthesizer is
inspecting the trace in Figure 6, and has just assigned the
instruction at level 6 a prefix of '1'.

Notice that level 12 is in the same couple-class as level 6.
Since the instruction at each of these levels is in the same
couple-class, the possibility exists that they may be the
same instruction. If the instruction at level 11 is assigned
a label of '2' when the working level reaches that part of
the trace, then the assignment at level 12 will be a forced
assignment of '1'. However, an entry has already been made
in the failure memory at (12,1) which indicates that the
instruction at level 12 cannot be assigned a prefix [...]
backup, dynamic nonequivalence processing causes an entry at
(11,2) of the failure memory which corresponds to the
labelling of '2' given to the instruction at level 5. Once
this is accomplished, when the working level descends to
level 11, an assignment of '2' cannot be made and, as a
result, the assignment at level 12 will no longer be forced
by dynamic equivalence, which gives the synthesizer a chance
to try other assignments that will maintain determinism of
the algorithm.
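
The look-ahead performed by dynamic nonequivalence
processing can be sketched as below. The function name, the
data layout, and the level recorded as the cause of the
existing (12,1) entry are assumptions; the levels and labels
follow the example in the text.

```python
from typing import Dict, List, Optional, Tuple

Cell = Optional[Tuple[int, int]]   # None marks an invalid (empty) cell


def block_forcing(fm: Dict[int, List[Cell]], couple_class: Dict[int, str],
                  labels: Dict[int, int], lead_level: int,
                  working_level: int,
                  free_states: int = 0) -> List[Tuple[int, int]]:
    """Prevent later forced assignments that are already known to fail.

    The triple (lead_level, lead_level + 1) is labelled. For every later
    triple in the same couple-class whose forced label is forbidden, the
    lead label is forbidden one level earlier so that no forcing occurs.
    """
    trail_level = lead_level + 1
    cls = couple_class.get(trail_level)
    if cls is None:
        return []
    lead_label, trail_label = labels[lead_level], labels[trail_level]
    made = []
    for future in sorted(fm):
        if future <= working_level + 1 or couple_class.get(future) != cls:
            continue
        if fm[future][trail_label - 1] is None:
            continue                       # the forced label is still legal
        if fm[future - 1][lead_label - 1] is None:
            fm[future - 1][lead_label - 1] = (working_level, free_states + 1)
            made.append((future - 1, lead_label))
    return made


fm = {level: [None, None] for level in range(1, 13)}
fm[12][0] = (3, 1)                         # (12,1) already forbidden
couple_class = {6: "k", 12: "k"}
labels = {5: 2, 6: 1}
entries = block_forcing(fm, couple_class, labels,
                        lead_level=5, working_level=6)
```

With these values a single entry is made at (11,2), so label
'2' can no longer be chosen at level 11 and level 12 is not
forced to '1'.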

Pseudo-assignment conditions and fixup conditions can occur
in the failure memory as a result of validation of all but
one of the failure memory cells in a row, in the same manner
that they occur in difference set resolution. Additionally,
dynamic equivalence and difference set resolution can
interact to cause failure memory entries in the following
manner. If a failure memory entry is made by difference set
resolution at any level which is in the same couple-class as
a level previously assigned a prefix label, and if the
failure memory entry prevents the assignment that would
cause the instructions to become part of the same state,
then dynamic nonequivalence will result; therefore, an entry
must be made in the failure memory to indicate this
condition.

3. Backup/Fixup

The discussion of backup and fixup conditions has been saved
until last. The basic idea behind constructing the
synthesizer is to provide as much information as possible to
the search mechanism, and thereby direct the label
assignment with a minimal number of retries. With this in
mind, backup and fixup become last resorts.

The fixup operation attempts to resolve nondeterminism by
incrementing the label at the current working level when a
contradiction occurs. If the newly incremented label is not
a legal assignment or does not correct the contradiction,
then backup must be initiated. The fixup operation cannot be
attempted if the assignment at the current working level is
forced or if the assignment created a new state. In either
of these cases, a fixup operation would leave nondeterminism
in the algorithm.

If a fixup fails, or cannot be attempted, backup is
initiated. Backup must be initiated from the current working
level when any level is discovered which contains one of
these conditions:

1) The label assignment is forced and the failure memory
cell corresponding to that level and label is valid,

2) The label assignment causes a contradiction and
represents a new state, or

3) There is no free state available for the instruction
at a particular level, and all entries in the failure
memory row at that level are valid.

The backup begins at the current working level regardless of
which level triggered the mechanism, and continues until
none of the three conditions given above is present. At that
level a fixup operation is attempted and the search begins
anew. Any entries into the failure memory which were caused
by levels greater than or equal to the new current working
level are invalidated by resetting the failure memory
entries to (0,0). Additionally, any assignments are deleted
along with their side-effects, such as annotations on forced
assignments and new states. If backup causes the working
level to be decremented to zero, a free state is added for
the use of the first instruction needing more states than
initially allotted as the lower bound.
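
The backup procedure can be approximated by the sketch
below. It is a simplification: it only walks back past
forced and new-state levels (the fence condition is not
modelled), and the function name, parameters, and the
fixup_failed flag are assumptions of the sketch rather than
parts of the original design.

```python
from typing import Dict, List, Set, Tuple

INVALID = (0, 0)   # a reset failure memory cell


def backup(fm: Dict[int, List[Tuple[int, int]]], labels: Dict[int, int],
           forced: Set[int], new_state: Set[int],
           working_level: int, fixup_failed: bool = False) -> int:
    """Back up to the last level whose label can still be fixed up.

    Returns the new working level; 0 means that the caller must add a
    free state before the search can restart.
    """
    level = working_level - 1 if fixup_failed else working_level
    # Forced levels and new-state levels cannot be retried.
    while level > 0 and (level in forced or level in new_state):
        level -= 1
    # Invalidate entries caused by levels >= the new working level.
    for row in fm.values():
        for col, (cause, _free) in enumerate(row):
            if cause != 0 and cause >= level:
                row[col] = INVALID
    # Delete later assignments together with their annotations.
    for later in [lvl for lvl in labels if lvl > level]:
        del labels[later]
    forced.difference_update({lvl for lvl in forced if lvl > level})
    new_state.difference_update({lvl for lvl in new_state if lvl > level})
    return level


fm = {9: [(8, 1), (1, 1)]}
labels = {lvl: 1 for lvl in range(1, 9)}
forced, new_state = {7, 8}, set()
new_level = backup(fm, labels, forced, new_state, working_level=8)
```

In this hypothetical run levels 8 and 7 are forced, so the
search returns to level 6, where a fixup would then be
attempted.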
III. PREPROCESSOR

A. PROBLEM SPECIFICATION

The program synthesizer expects a set of triples where each
triple is an instruction, a condition, and an instruction.
Biermann [2] has shown that conditions inadvertently or
purposely omitted by the user may be inserted into a trace.
The algorithm for insertion of conditions collects the set
of atoms seen on the transitions for an instruction. An atom
is an entity which has a value of either 'true' or 'false'.
A condition is composed by logical conjunction and
disjunction operations on atoms. For example, an atom may be
'c <- id', but a condition may be 'o <=0 and a * 4'. A set
of minterms is computed from the set of atoms and one of the
minterms is inserted after each occurrence of that
instruction in the trace. If {a,b} is a set of atoms, then
the set of minterms will be

{{a,b},{~a,b},{a,~b},{~a,~b}}

where ~ stands for logical negation. It has been shown in
reference [16] that only one of the minterms can be true for
each occurrence of a transition from any single instruction.
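
The minterm construction can be made concrete with a short
sketch. The function names and the string encoding of
negated atoms (a leading '~') are assumptions chosen for
illustration.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def minterms(atoms: Sequence[str]) -> List[Tuple[str, ...]]:
    """All 2**n conjunctions of the atoms, each plain or negated."""
    result = []
    for signs in product((True, False), repeat=len(atoms)):
        result.append(tuple(atom if sign else "~" + atom
                            for atom, sign in zip(atoms, signs)))
    return result


def holds(minterm: Tuple[str, ...], truth: Dict[str, bool]) -> bool:
    """True when every literal of the minterm agrees with `truth`."""
    return all(truth[lit.lstrip("~")] != lit.startswith("~")
               for lit in minterm)


terms = minterms(["a", "b"])
# For any truth assignment exactly one minterm holds.
true_terms = [t for t in terms if holds(t, {"a": False, "b": True})]
```

For the atoms {a,b} this yields the four minterms listed
above, and for the sample assignment only (~a, b) holds.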

One problem with the algorithm is that it is incapable of
inserting conditions if the user has failed to supply any
atoms after a particular instruction. For example, if the
user should specify instruction I1 followed by instruction
I2 in one part of the trace and instruction I1 followed by
I3 in another part of the trace, but the user fails to
provide a condition after either occurrence of I1, then the
algorithm will be unable to generate a condition for I1. It
is assumed that I1 does not appear with an atom elsewhere in
the trace. The synthesizer will force two states for I1 to
resolve any nondeterminism. This mechanism is fully
explained in Section II. If conditions had been supplied in
the above example, the difference in the two programs would
be the number of states assigned to instruction I1. Figure
26 shows a partial computation without explicitly expressed
conditions along with the associated synthesized program
fragment. Figure 26 assumes that I1 does not appear
elsewhere in the trace. Figure 27 is a representation of the
same partial computation except that the conditions c1 and
c2 have been explicitly expressed. The computations in both
figures are the same, and each program fragment will
correctly execute either trace; therefore, the programs must
be equivalent with respect to program behavior. However, the
program in Figure 27 is minimal in that it contains fewer
states because the user explicitly supplied the conditions.
(S,...,I1,I2,...,I1,I3,...,H)
Example Computation

Synthesized Program

Figure 26. Computation without Explicit Conditions

(S,...,I1,c1,I2,...,I1,c2,I3,...,H)
Example Computation

Synthesized Program

Figure 27. Computation with Explicit Conditions


We intend to show that there are mechanisms which can be
used to automatically generate the necessary conditions for
the correct synthesis of an algorithm produced by an example
computation without the user explicitly defining them. The
problem may be described as follows. Given an example
computation without explicitly defined conditions, infer
those conditions necessary to control the flow of
computation in a manner such that the synthesized program
will demonstrate the behavior desired by the user. In order
to facilitate the solution to the problem, a condition will
be viewed as a function that returns a value of 'true' or
'false' when called rather than a logical operation on
atomic boolean entities. The problem can then be thought of
as constructing a function.

Very little information is available to the current version
of the synthesizer when the user provides only a sequence of
instructions, certainly not enough to generate minimal
programs such as the one in Figure 27. This led us to search
for other sources of information that would allow us to
construct the necessary conditions. We soon realized that
the instructions issued by the user do not exist in a
vacuum. These instructions manipulate data. If the entire
computer memory, including registers, is viewed as the
domain of interest, then execution of an instruction always
changes this state. Intuitively, the domain also reflects
the reason that the user decided to execute a particular
instruction. A search of a space of this size in order to
determine the reason is impractical; however, observing only
those data elements affected by the sequence of instructions
can often be quite practical and can significantly reduce
the search space.

We chose the text editing domain as the domain of interest
since we felt that it would be sufficiently interesting to
warrant application of synthesis techniques. This domain was
selected because, first, techniques developed in this domain
may be general enough for extension into other domains;
secondly, the world for this domain can be described as the
set of all characters contained in a particular text file,
which makes the world finite; and finally, the instruction
set is small enough to be manageable.

Although our primary research is directed toward studying
techniques to apply to automatic condition generation, we
feel that the synthesizer could be a powerful text editor
and could provide some useful features not normally seen in
conventional text editors. Extended features could include
the ability to capitalize the first letter of every
sentence, the ability to capitalize all small letters in the
text, the ability to identify a string and perform some
operation before, after or on it, or any combination of
these editing actions.

The working hypothesis is to have the user process the text
file in a normal manner and have the synthesizer infer a
program from his actions. Two requirements were levied upon
the user. The first requirement is that he must inform the
synthesizer when he desires to have a program generated so
that the synthesizer can begin monitoring the user's
actions. A great deal of time was spent trying to devise
methods that allowed one general mechanism to be used to
monitor the user's actions and the resulting changes in the
text file. Since we could not produce such a mechanism, a
second requirement was levied on the user. This requirement
recognizes a basic distinction between two different aspects
of text editing: context free substitutions and context
sensitive substitutions. We define a context free
environment to be one in which the character to be operated
upon is not dependent on characters around it. Capitalizing
all occurrences of small letters is an example of a context
free operation. A context sensitive operation is defined as
an operation in which the action to be performed on a
character or sequence of characters depends upon other
characters around the main character of interest.
Capitalizing the first letter of every sentence is a context
sensitive operation. Condition inference in a context
sensitive environment is inherently more difficult than in a
context free environment in that the condition must be
constructed from events which require a look-ahead
capability not inherent in the synthesizer. The user will be
free to switch from environment to environment at his
convenience. The synthesizer will create program segments
from each environment which can be used to construct a
complete program by a post-processor.

B.  DESIGN  FOR  A  CONTEXT  FREE  ENVIRONMENT 
1. Overview

Programs that operate on a single entity can be constructed
by the synthesizer. Figure 28 shows the construction of a
program from a trace intended to communicate that the letter
'd' should be capitalized wherever it appears in the text
file. The column labelled 'trace' contains triples of the
form instruction, condition, instruction. B is the start
instruction, R is the move right instruction, C is the
capitalize or change instruction and S is the stop
instruction. The conditions for this trace are the
characters seen in the text file prior to the execution of
the second instruction in each triple. The special condition
'0' is the null condition, and is always inserted after the
start instruction.

The generated program will correctly execute the trace that
was used to construct it, and by examination of the program
it can be shown that the program will convert all d's to D's
in a text file consisting of the characters A, b, C, d, F
and G. There are no arcs available for other characters in
the character set. In order to generate a program to perform
the same function on an arbitrary text file, the user would
be forced to give an example of the desired transition for
every character in the character set.

Since it is desirable to relieve the user of the chore of
providing an inordinate number of examples in order to
completely specify the function, a method is required that
utilizes a few examples of the types of conditions that are
to appear on the arcs to generalize the conditions into a
more compact and complete form. If a generalization can be
found, the multiple arcs may be replaced with a more general
condition and, therefore, correct programs can be created
with fewer examples. However, the combination of arcs
between nodes must be accomplished so that determinism is
maintained, or the synthesizer will not create a minimum
state machine capable of performing the desired function.
That means that the generalization technique must be able to
handle conflicts properly. The arcs in Figure 28 that
originate at state R and terminate at state R appear to
consist of elements from the capital letters and small
letters. The generalization
{x | x ∈ capital letters} U {z | z ∈ small letters} would
appear to be a reasonable replacement for all of the R to R
arcs. If this generalization were made, a conflict would
result because the letter 'd', which labels the arc from R
to C, is also an element of {z | z ∈ small letters}.


Trace

B  0  R
R  A  R
R  b  R
R  C  R
R  d  C
C  D  R
R  F  R
R  G  R
R  0  S


Synthesized program


Figure 28. Synthesizer Action


Structure


The preprocessor is designed to accumulate knowledge
from the traces it is provided, then use the knowledge to
construct meaningful conditions. The preprocessor scans the
input trace looking at the instructions and characters that
are seen before the instructions. This phase extracts pairs
of instructions from the trace. The trace in Figure 28 would
have the instruction pairs (B,R), (R,R), (R,C) and (C,R)
extracted. Attached to each of these pairs is the set of
characters that were seen between the pair. The preprocessor
then analyzes the information to determine if a
generalization can be made from the set of characters
associated with each instruction pair.
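
The pair extraction described above can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the trace of Figure 28 is written as (from, condition,
to) triples, `None` stands in for the null condition, and the
name `extract_pairs` is hypothetical rather than taken from the
original system.

```python
# Figure 28 trace as (from-instruction, condition character, to-instruction).
# None stands in for the null condition (an assumption of this sketch).
TRACE = [
    ("B", None, "R"), ("R", "A", "R"), ("R", "b", "R"),
    ("R", "C", "R"), ("R", "d", "C"), ("C", "D", "R"),
    ("R", "F", "R"), ("R", "G", "R"), ("R", None, "S"),
]


def extract_pairs(trace):
    """Map each instruction pair to the set of characters seen between it."""
    pairs = {}
    for src, char, dst in trace:
        seen = pairs.setdefault((src, dst), set())
        if char is not None:
            seen.add(char)
    return pairs


pairs = extract_pairs(TRACE)
for pair, chars in pairs.items():
    print(pair, sorted(chars))
```

The R to R pair collects both capital and small letters, while
the R to C pair holds only 'd', which is the conflict the
generalization step has to respect.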

The natural division mentioned above allows the
preprocessor to be divided into two modules. The first
module performs the scanning function while the second
module analyzes the information and applies a heuristic to
provide the most general condition possible. The
implementation of the preprocessor will be discussed later,
but before it can be discussed an explanation of the data
structures required by the preprocessor is needed.


Preprocessor  Data  Structures 


To simplify the problem we define two types of
instructions in this domain. Instructions that specify the
current location of interest are cursor positioning
instructions. Instructions that change the state of the
domain are data manipulation instructions. The preprocessor
accepts as input a sequence of instructions and an
associated sequence of characters. The first instruction in
the instruction sequence is always the start instruction,
which does not have a character associated with it. The last
instruction in the sequence is always a halt instruction.
Every action performed by the user is captured and appended
to the instruction sequence list. The character sequence is
created in harmony with the instruction sequence. In the
quiescent state the cursor will indicate a certain position
in the text. When the user performs some action such as move
the cursor right, a monitor picks up the value in the old
position and associates that value with the instruction
executed by the user. For example, assume a user has a text
file in lower case letters that he wants to change to all
upper case letters. The user initiates the synthesizer then
proceeds across the line of text changing lower case letters
to upper case letters. For the purpose of this example,
assume the line of text is "change lower case to upper
case". As the user moves across the line making
substitutions, the condition monitor captures the actions
performed and the characters seen. The example line would
yield an instruction sequence of (B, C, R, C, R, C, R,
..., C, S). The associated character sequence would be (c,
C, h, H, a, A, ..., e, E). The "C" and "R" in the
instruction sequence are the capitalize and move right
instructions, respectively. Note that the capitalize
instruction does not reposition the cursor, and when the user
moves the cursor to the right, the result of the capitalize
instruction is associated with the move.
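
A minimal sketch of such a monitor follows. It is an
illustration, not the original implementation: the function
name `monitor`, the single-letter action codes, and the choice
to record the character under the cursor when the halt
instruction is issued are all assumptions made for the example.

```python
def monitor(line, actions):
    """Replay user actions over one line of text.

    actions is a string of 'C' (capitalize in place) and 'R' (move right).
    Returns the instruction sequence, the character sequence, and the
    edited line. Each character is the value under the cursor just
    before the paired instruction executes.
    """
    text = list(line)
    cursor = 0
    instructions = ["B"]  # start instruction; it has no character
    characters = []
    for action in actions:
        characters.append(text[cursor])
        instructions.append(action)
        if action == "C":
            text[cursor] = text[cursor].upper()
        elif action == "R":
            cursor += 1
    # Halt instruction; recording the character under the cursor here
    # is an assumption of this sketch.
    instructions.append("S")
    characters.append(text[cursor] if cursor < len(text) else None)
    return instructions, characters, "".join(text)


instructions, characters, edited = monitor("cha", "CRCRC")
print(instructions)
print(characters)
print(edited)
```

Because the capitalize instruction leaves the cursor in place,
the capitalized result is the character recorded with the
following move right.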
Another data structure needed by the preprocessor is
the ASCII vector. The ASCII vector is a 128-byte linear
array with indices numbered 0 through 127. Each byte in the
array is referenced by the decimal value of a particular
ASCII character. For example, the array element reserved for
the ASCII character '0' is indexed by 48 decimal. The array
element reserved for the ASCII character 'a' is indexed by
97 decimal. The vector defines a partition of the ASCII
character set by using the following technique. The ASCII
character set has been divided into eight mutually exclusive
subsets.

Subset 0  Capital letters

Subset 1  Small letters

Subset 2  Numbers

Subset 3  Space character <sp>

Subset 4  Symbols

Subset 5  Punctuation

Subset 6  Arithmetic operators

Subset 7  Control characters

The subset name is entered into the ASCII vector at each
cell by converting the ASCII character to its decimal
equivalent and using that value as the array index. The
default partition is shown in Figure 29.
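
The following sketch builds such a vector. The subset numbers
follow the list above, but the text does not say which
characters belong to the Symbols, Punctuation and Arithmetic
operator subsets, so the memberships chosen here
(`ARITHMETIC`, `PUNCTUATION`, and everything left over as
Symbols) are assumptions made purely for illustration.

```python
import string

CAPITAL, SMALL, NUMBER, SPACE, SYMBOL, PUNCT, ARITH, CONTROL = range(8)

# Assumed memberships; the source does not enumerate these subsets.
ARITHMETIC = set("+-*/=<>")
PUNCTUATION = set(".,;:!?'\"()")


def build_ascii_vector():
    """Return a 128-cell list mapping each ASCII code to its subset name."""
    vector = [SYMBOL] * 128  # anything not classified below is a symbol
    for code in range(128):
        ch = chr(code)
        if ch in string.ascii_uppercase:
            vector[code] = CAPITAL
        elif ch in string.ascii_lowercase:
            vector[code] = SMALL
        elif ch in string.digits:
            vector[code] = NUMBER
        elif ch == " ":
            vector[code] = SPACE
        elif ch in ARITHMETIC:
            vector[code] = ARITH
        elif ch in PUNCTUATION:
            vector[code] = PUNCT
        elif code < 32 or code == 127:
            vector[code] = CONTROL
    return vector


ascii_vector = build_ascii_vector()
print(ascii_vector[ord("0")], ascii_vector[ord("a")], ascii_vector[ord("A")])
```

Looking up a character is then a single array access indexed by
its decimal ASCII value.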
ASCII  0  1  ...  9  A  B  ...  Z

Figure 29. ASCII Vector

The character set hierarchy is defined by the tree
structure in Figure 30. The tree is related to the ASCII
vector through the character subset names contained on each
node one level above the leaf nodes. For the default
hierarchy shown in Figure 30, a zero would be entered in the
ASCII vector for all capital letters, and a 1 would be
entered for all small letters. If a different partition of
the character set is required the user can modify the
hierarchy or create his own. An example will be given to
explain how the modification may be accomplished. Assume a
partition is desired where the vowels are isolated into a
set. Assume further that the vowels are to be subdivided
into capital vowels and small vowels. The hierarchy would be
modified by placing a son called 'vowels' on the alphabetic
node. Attach to the new node two sons, called 'Cap-vowels'
and 'Small-vowels', with arcs to the appropriate characters.
Relabel the hierarchy so that sibling relations are numbered
in increasing order. Finally, initialize the ASCII vector
with the new labelling. All of the modifications can be done
by the system when the user calls for the modification. The
modified hierarchy is shown in Figure 31.
The next data structure used by the preprocessor is
the transition table. The transition table contains the
knowledge gleaned from scanning the instruction sequence and
the character sequence created by the monitor. Figure 32
shows the format of the transition table. The transition
table is an array of records with each record containing
information on a transition. In the table, I1 and I2 are
instructions where I2 directly follows I1 in at least one
place in the instruction sequence. 'Active-sets' is a field
that contains information on sets of characters that have
been observed by the monitor on the transition from I1 to
I2. The fields 'Set-1' through 'Set-n' contain the value for
set name, the count of the elements from the set associated
with the transition and a pointer to a linked list of the
elements. The records that would be created for the trace
given in Figure 28 would be associated with the transitions

B to R, R to R, R to C, C to R and R to S.

| I1 | I2 | Active-sets | Set-1 | Set-2 | ... | Set-n |
|    |    |             |       |       |     |       |
|    |    |             |       |       |     |       |
                          .
                          .
                          .

Figure 32. Format of the Transition Table
4.  Implementation 

The context free preprocessor consists of two main
modules: the scanner and the insertion modules. Another
important module, not part of the preprocessor, is the user
monitor. The monitor garners the actions of the user and
creates two arrays. One array contains the sequence of
instructions the user provided and the other contains
information on what was true before an instruction was
executed. The information that is garnered is then passed to
the appropriate preprocessor.

The example instruction and character sequences
given in Figure 33 will be the example used to explain the
mechanism of the preprocessor. Figure 33 is illustrative of
a collection of actions that were performed by some user.
The user's goal is: change all lower case letters in a text
file into upper case letters. The user has activated the
condition monitor, positioned the cursor at the beginning of
a line of text and moved right along the line, changing the
lower case letters to upper case whenever one appeared above
the cursor. Figure 33 is an example of output from the
monitor assuming the line the user processed was "The
numbers 1, 2, 3, 5, 7 ARE prime.". The first column in
Figure 33 is the character array. It contains the character
under the cursor prior to execution of the instruction in
column two. Column two is a trace of the actions performed
by the user. The "r" represents the "move cursor right"
instruction and the "c" represents a change without cursor
reposition instruction. Figure 33 can be read as: the
character in column one was observed and the instruction in
column two was executed.
Figure 33. Monitor Output
The scan module of the preprocessor is activated
when the user indicates the representative example is
complete. Let 'inst-index' be an index for the instruction
array that is initialized to 1. The first step is to create
a transition from the start instruction to the first
instruction in the instruction array and add the transition
to the transition table. This transition will indicate the
beginning of the program and will transition to the first
instruction provided on a null condition. The module then
moves down the instruction array creating other transitions
and adding them to the transition table. Duplicate
transitions will not appear in the table. A transition is
defined as a pair (I1,I2), where I1 and I2 are instructions
and I2 follows I1 within the instruction array. The
instruction array in Figure 33 yields transitions (fi,C), (C,R),

The transitions are constructed by indexing through
the instruction array. The instructions at inst-index and
inst-index + 1 form a transition. The transition is then
matched against the transition table. If a match occurs, the
character in the character array at inst-index + 1 is
extracted and its ASCII value is used to index into the
ASCII vector. The value stored in the ASCII vector is used
as an exponent for two and stored in a temporary variable. A
bit by bit logical OR is performed between the temporary
variable and the Active-sets variable for the transition and
the result is stored in Active-sets. Active-sets contains
the information of every set from the partition that has
elements seen on the transition. The operation described
above allocates one bit for each set in the partition. If
Active-sets equals 1 then bit one of Active-sets is a 1,
signifying at least one element of set 1 has been seen on
this transition (bits and sets are counted here from one, so
set 1 is Subset 0, the capital letters). A two would signify
that some element of set two had been seen and a three would
signify that some element of set one and some element of set
two had been seen.
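
The Active-sets update amounts to setting one bit per observed
subset. The sketch below illustrates it; `subset_of` is a
deliberately reduced, hypothetical stand-in for the ASCII vector
lookup (it lumps everything that is not a letter or digit into
subset 3), and `update_active_sets` is a name chosen for this
example.

```python
def subset_of(ch):
    """Reduced stand-in for the ASCII vector lookup (assumption)."""
    if ch.isupper():
        return 0  # capital letters
    if ch.islower():
        return 1  # small letters
    if ch.isdigit():
        return 2  # numbers
    return 3      # everything else, lumped together for this sketch


def update_active_sets(active_sets, ch):
    """OR in two raised to the subset number of the observed character."""
    return active_sets | (1 << subset_of(ch))


active = 0
for ch in "Ab":
    active = update_active_sets(active, ch)
print(active, format(active, "04b"))
```

After seeing one capital and one small letter the field holds
three (0011 binary), matching the description above.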

In the transition table are fields for each set that
has been determined to be active for the transition. Within
each of the set fields there are three subfields: the first
is the set name, the second is a count of the elements seen
for the set and the last is a pointer to the start of a
circularly linked list containing the elements used from the
set. The value that was obtained from the ASCII vector is
used as a set name and matched against each of the set
fields' set name. If the set name matches an entry, the
character at inst-index + 1 is added to the linked list in
lexicographical order if not already on the list and the
count is incremented by one. If a match does not occur on
the set name, a new set field is created and given the name
that was obtained from the ASCII vector, the count is set to
one, and the character is put on the list.

When the scan module reaches the end of the input,
the transition table contains an entry for each transition
that was seen. Each transition is associated with all the
sets that had elements seen with the transition. Finally,
each transition is associated with the actual elements
through the linked list for each set. The information is
then passed to the insertion module for analysis. Figure 34
shows the completed transition table and the linked list of
elements for each set.
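
The whole scan can be sketched as below. Several choices here
are assumptions made for illustration and are not taken from
the original system: the table is a Python dictionary rather
than an array of records, a sorted list replaces the circularly
linked list, the count is simply the length of that list, and
`subset_of` is a reduced stand-in for the ASCII vector.

```python
import bisect


def subset_of(ch):
    """Reduced stand-in for the ASCII vector lookup (assumption)."""
    if ch.isupper():
        return 0
    if ch.islower():
        return 1
    if ch.isdigit():
        return 2
    return 3


def scan(instructions, characters):
    """Build the transition table.

    characters[i] is the character seen before instructions[i] executed.
    The start instruction 'B' is not part of the instruction array.
    """
    # Transition from the start instruction on the null condition.
    table = {("B", instructions[0]): {"active_sets": 0, "sets": {}}}
    for i in range(len(instructions) - 1):
        key = (instructions[i], instructions[i + 1])
        record = table.setdefault(key, {"active_sets": 0, "sets": {}})
        ch = characters[i + 1]
        name = subset_of(ch)
        record["active_sets"] |= 1 << name
        elements = record["sets"].setdefault(name, [])
        if ch not in elements:
            bisect.insort(elements, ch)  # keep lexicographical order
    return table


table = scan(["R", "C", "R", "C", "R", "R"], ["T", "h", "H", "e", "E", "7"])
for key, record in table.items():
    print(key, record)
```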

Once a completed transition table has been created,
control is passed to the insertion module. The insertion
module processes the information in the transition table and
assigns a condition for each transition.
NOTE: The notation <1>, <2>, etc. represents a pointer to
the linked list headed by the same symbol.

Figure 34. Completed Transition Table
The Active-sets entries provide an efficient
mechanism for recognizing potential conflicts on emanating
arcs. Performing a bit by bit AND on the Active-sets entries
that have a common originating instruction yields the source
of conflicts. The bit positions that are on (bit equals 1)
are the set (or sets) that have had elements on multiple
transitions. For example, let (I1,I2) and (I1,I3) be entries
in the transition table with Active-sets values of five (0101
binary) and three (0011 binary) respectively. Let Q equal
the result of the bit by bit AND of the Active-sets values
given above (i.e. 0001). Q indicates that there is a
conflict between the transition (I1,I2) and the transition
(I1,I3). Furthermore, Q indicates that the set causing the
conflict is labelled zero in the hierarchy of Figure 30
because the on bit is in the rightmost position, which
corresponds to two raised to the zero exponent. Using the
exponent to enter the hierarchy, it can be determined that
capital letters were seen on both transitions. Once all the
conflicts for transitions with the same originating
instruction are known, the conflicts must be resolved before
an assignment of conditions can be made.
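
The conflict test is a single AND followed by reading off the
bits that remain set. A minimal sketch, with the hypothetical
name `conflicting_sets`:

```python
def conflicting_sets(active_a, active_b):
    """Return the subset numbers whose bits are on in both values."""
    q = active_a & active_b
    return [bit for bit in range(q.bit_length()) if (q >> bit) & 1]


# The example from the text: five (0101) and three (0011).
print(conflicting_sets(0b0101, 0b0011))
```

Each returned number is the exponent used to enter the
hierarchy, so a result containing only zero points at the
capital letters.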

Extending the example given above, assume that eight
capital letters were seen on transition (I1,I2) and four
capital letters were seen on the transition (I1,I3). A
partial condition can be constructed for the transition
(I1,I2) as a set difference between the set of capital
letters and the actual elements seen on the transition
(I1,I3). The partial condition for the (I1,I3) transition
becomes the set of capital letters that were actually seen
with this transition. The initial conditions for these
transitions become the union of the sets indicated in
Active-sets as not being in conflict and the sets created by
the resolution of conflicts. Therefore, the condition for
(I1,I2) is $(\{x \mid x \in \text{capital letters}\} -
\{x \mid x \in \text{capital letters on other transitions}\})
\cup \{x \mid x \in \text{numeric}\}$, and the condition for
(I1,I3) becomes $\{z \mid z \in (\{\text{actual capital letters
seen}\} \cup \{\text{small letters}\})\}$. In this example, it
was assumed that the sets, numeric and small letters, were an
appropriate generalization for the transition. In practice
it cannot be done without consideration of the number of
elements that have been seen from the set on the transition.
If the count field for the set exceeds a threshold value for
the set, the generalization may be made; otherwise the
elements themselves become the partial condition for the
transition.
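
The resolution step can be illustrated for two transitions that
leave the same instruction. The sketch below makes several
assumptions that the text does not state: conditions are held
as concrete character sets, the transition with more observed
elements of a conflicting set is the one that receives the set
difference, a count that reaches the threshold is treated as
sufficient, and `THRESHOLD` is an arbitrary value chosen for
the example.

```python
import string

SUBSETS = {
    0: set(string.ascii_uppercase),  # capital letters
    1: set(string.ascii_lowercase),  # small letters
    2: set(string.digits),           # numbers
}
THRESHOLD = 3  # assumed value; the text does not give one


def resolve(seen_a, seen_b):
    """Return conditions for two transitions with a common origin.

    seen_x maps a subset number to the characters observed on that
    transition. Each condition is returned as a set of characters.
    """
    def condition(mine, other):
        result = set()
        for name, chars in mine.items():
            rival = other.get(name, set())
            if rival and len(chars) <= len(rival):
                result |= chars                  # keep actual elements
            elif len(chars) >= THRESHOLD:
                result |= SUBSETS[name] - rival  # generalize the set
            else:
                result |= chars                  # too few examples
        return result

    return condition(seen_a, seen_b), condition(seen_b, seen_a)


cond_12, cond_13 = resolve(
    {0: set("ABCDEFGH"), 2: set("123")},  # transition (I1,I2)
    {0: set("WXYZ"), 1: set("abc")},      # transition (I1,I3)
)
print(sorted(cond_13 & SUBSETS[0]))
```

The two conditions come out disjoint, which is what keeps the
synthesized machine deterministic.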

After a condition has been constructed for a
transition, a final strong generalization technique is
employed. The Active-sets value for the transition again
supplies the starting point for this technique. Notice that
adjacent bits in Active-sets correspond to adjacent nodes in
the hierarchy. Therefore, a check is made of the Active-sets
to see if it has adjacent bits with a value of one. If it
does then a generalization may be attempted. Assume the
condition $((\{\text{capital letters}\} - \{A, E, I, O, U\})
\cup \{\text{small letters}\} \cup \{\text{numeric}\})$ has been
constructed for some transition. The Active-sets value for
this transition must be seven (0111 binary). With the default
hierarchy in Figure 30, a generalization to Alphabetic and
then to Alpha-numeric would be attempted. Notice that a
generalization to Alpha-numeric would fail because of a
conflict with another transition. Intuitively
$(\{\text{alpha-numeric}\} - \{A, E, I, O, U\})$ would be a
correct choice for the condition for this transition. A
general procedure for the construction of generalized
conditions is given below.

A set of nodes $Y = \{y_1, y_2, \ldots, y_n\}$ is
generalizable to a node $X$ if the set of nodes $Y$ forms a
complete and exhaustive set of leaves of the subtree rooted
at $X$. Further, a set of nodes $Z = \{z_1, z_2, \ldots, z_m\}$
is generalizable to the set $W = \{w_1, w_2, \ldots, w_j\}$,
$j < m$, where each $w_i$ is a generalization of a subset
of $Z$.

IF the condition $= F_1 \cup F_2 \cup \cdots \cup F_n$
   where $F_i = z_i - q_i$, $i = 1, \ldots, n$
   where $q_i \subseteq z_i$ ($q_i$ possibly null)

THEN

   the condition is set to $W - \bigcup_{1 \le i \le n} q_i$

   where $W$ is the smallest set

   $W = \{w_1 \cup w_2 \cup \cdots \cup w_j\}$

   such that $W$ generalizes $\{z_1 \cup \cdots \cup z_n\}$
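
The procedure can be sketched on a small tree. The fragment of
the hierarchy used here (`CHILDREN`) is an assumption inferred
from the worked example, since Figure 30 itself is not
reproduced, and the function names are hypothetical. The sketch
also omits the check for conflicts with other transitions.

```python
# Assumed fragment of the default hierarchy: parent -> children.
CHILDREN = {
    "alpha-numeric": ["alphabetic", "numbers"],
    "alphabetic": ["capital letters", "small letters"],
}


def generalize(nodes):
    """Replace every complete set of siblings by its parent, repeatedly."""
    nodes = set(nodes)
    changed = True
    while changed:
        changed = False
        for parent, kids in CHILDREN.items():
            if set(kids) <= nodes:
                nodes -= set(kids)
                nodes.add(parent)
                changed = True
    return nodes


def generalized_condition(terms):
    """terms is a list of (set name z_i, excluded characters q_i).

    Returns (W, Q): the condition is W minus the union Q of all q_i.
    """
    w = generalize(name for name, _ in terms)
    q = set().union(*(excluded for _, excluded in terms))
    return w, q


w, q = generalized_condition([
    ("capital letters", set("AEIOU")),
    ("small letters", set()),
    ("numbers", set()),
])
print(w, sorted(q))
```

For the example in the text this yields alpha-numeric minus the
five capital vowels.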


C.  DESIGN  FOR  A  CONTEXT  SENSITIVE  ENVIRONMENT 


1.  Overview 

Condition generation in the context sensitive
environment is a more difficult task than in the context
free environment. This difficulty arises from the scope of
knowledge required to make decisions on what a condition is
to be. The conditions themselves are more complex because
they depend not only on the character that is being seen,
but also on characters that precede and follow the
current character under consideration. The following example
will be used to illustrate the difficulties and our solution
to this problem. Assume a user wishes to capitalize all
occurrences of the word 'time' in some text file. Also
assume that the word occurs at the beginning, at the end,
and in the middle of sentences in the text file. The
question is how to construct a program that performs the
desired function given only the actions the user performs as
an example of the required program.

The assumption about the position of the word 'time'
in the text file implies that the requested action needs to
be accomplished on strings that have very different
characteristics. Certainly, both 'time' and 'Time' should be
capitalized, as should 'time,', 'time?' and 'time<sp>'. On
the other hand the string 'time' should not be capitalized
when it occurs within a word like 'sometime' or 'timely'.
Any generated program that behaves as described
above must be able to recognize an occurrence of the string
or some variation of the string. The totality of this
information must be glued together to provide a single
condition that is descriptive of what the surrounding
environment must be like before the action is performed. The
implication is that the condition itself must be able to
perform checking and look-ahead. In other words, the
condition for the transition to the operation must in fact
be a procedure which responds 'true' whenever the string of
interest is recognized. Assume for the present that the
string of interest can be discerned from the user's actions
(a hard problem by itself, see Angluin [19]). One must wonder
how such a procedure can be constructed and then inserted
into the generated program so that it performs the function
of a condition on some transition in the program. Figure 35
shows a procedure which recognizes the word 'time'. Note the
robustness of the procedure in that it distinguishes between
the differing occurrences of 'time' as mentioned above.
Figure 35 points out that the problem is not just generating
a procedure as a condition but also generating conditions
within the procedure that is to be the overall condition.
The arcs labeled 'T v t' and '<SP> v {punctuation}' should
be noted with interest because they provide the robustness
the condition procedure needs. The discovery of arc labels
for the condition procedure will be discussed next.
Figure 35. Condition for "time" and "Time"
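
A condition procedure of this kind might look like the sketch
below. It is not a transcription of Figure 35: the name
`is_time_at`, the use of Python's `string.punctuation` as the
punctuation set, and the decision to treat the start and end of
the text as valid boundaries are all assumptions made for
illustration.

```python
import string

# Characters allowed to follow the word: space or punctuation (assumed set).
BOUNDARY = set(" ") | set(string.punctuation)


def is_time_at(text, i):
    """True when 'time' or 'Time' stands as a whole word at position i."""
    if text[i:i + 4] not in ("time", "Time"):   # arc 'T v t', then i, m, e
        return False
    before_ok = i == 0 or text[i - 1] == " "
    after = text[i + 4:i + 5]
    after_ok = after == "" or after in BOUNDARY  # arc '<SP> v {punctuation}'
    return before_ok and after_ok


for sample, pos in [("The time is", 4), ("sometime ", 4), ("timely", 0)]:
    print(repr(sample), is_time_at(sample, pos))
```

The look-ahead on the character after the word is what rejects
'timely', and the look-behind rejects 'sometime'.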
The monitoring of user actions provides the
instruction and character sequence in the same manner as
done in the context free mode. Consideration was given to
requiring that more information be provided by the monitor;
however, the notion was discarded because it would require
the user to be aware of the functioning of the preprocessor.
Requiring the user to provide information to the system
would betray our goal for the system. The user should only
be required to initiate the system and then perform editing
as if the system was not actively monitoring his actions. We
feel the requirement of specifying whether the user wants to
perform context free or context sensitive operations is the
maximum that should be asked. If it were feasible to
recognize the difference between the two modes from the
user's actions alone, this limitation would also be removed.

Given only the instruction sequence, the character
sequence, and the information of a context sensitive
environment, the first assignment of the context sensitive
preprocessor is to discern the string of characters upon
which some operation is to be performed. This is a pattern
recognition problem of considerable difficulty. Angluin [19]
provides the following theorem, "There is an effective
procedure which, when given a sample S as input, outputs a
pattern p which is descriptive of S." The sample S is a
subset of the set of all strings over the alphabet of the
language. The effective procedure is computationally
expensive and not desirable to implement in our
system. The procedure is an enumeration technique on
patterns with a length less than the shortest example in the
sample set S. Each of the enumerated patterns is tested to
determine if it is descriptive of the entire set S. The
longest pattern that is descriptive of S is the most
specific pattern for the set. Clearly, as the length of the
sample grows, the number of enumerated patterns will
grow exponentially. Angluin [19] states, "in the general
case, the test performed on the patterns is an NP-complete
problem." The test she is referring to is the check to see
if the enumerated pattern is descriptive of S.

For implementation purposes, we need a mechanism
that falls well short of the exponential behavior of the
effective procedure mentioned above. The text editing domain
has two types of instructions for the purpose of this paper.
The first type of instruction will be called cursor
positioning instructions while the second type will be
called data manipulating instructions. Assuming the text
file is to be represented as a linear array, only one cursor
position instruction need concern us. All cursor positioning
commands such as move left, move up or move down can be
represented as move right instructions. Data manipulation
instructions operate on one character and do not reposition
the cursor.


The method we have adopted for determining the
string of interest and the context of the string is based on
the above definition of the types of instructions available
in the text editing domain. The preprocessor scans the
instruction sequence looking for an occurrence of a data
manipulation instruction. The character associated with this
instruction is then taken as the first character of the
string of interest. Other characters are added to the string
by continuing the scan until multiple occurrences of cursor
positioning instructions are encountered. A hypothesis is
then constructed consisting of three parts. The first part
is the beginning context. It is constructed from the
characters that preceded the string in the character
sequence. The second part is the string itself and the final
part is the ending context constructed from the characters
seen after the string. For engineering considerations, the
number of characters in the beginning and ending context
will be limited to twenty characters. The probability of the
context exceeding twenty characters on both sides of the
string in the text editing domain is small enough to ignore.
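
The scan for a hypothesis can be sketched as follows. The
sketch assumes the instruction and character sequences are
given as equal-length strings with 'C' for a data manipulation
and 'R' for a move right, and that every manipulation is
followed by a move; `extract_hypothesis` and `MAX_CONTEXT` are
names chosen for this example.

```python
MAX_CONTEXT = 20  # context limit stated in the text


def extract_hypothesis(instructions, characters):
    """Build (begin context, string, end context) from one fragment.

    characters[i] is the character seen before instructions[i].
    Raises ValueError if the fragment has no data manipulation.
    """
    start = instructions.index("C")
    k = start
    string = []
    while k < len(instructions) and instructions[k] == "C":
        string.append(characters[k])
        k += 2  # skip the move right that sees the modified character
    return {
        "begin": characters[max(0, start - MAX_CONTEXT):start],
        "string": "".join(string),
        "end": characters[k:k + MAX_CONTEXT],
    }


hypothesis = extract_hypothesis("RRRRCRCRCRCRRRR", "The tTiImMeE is")
print(hypothesis)
```

The loop stops at the first move right that does not follow a
manipulation, which is the point where consecutive cursor
positioning instructions begin.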

Once a hypothesis is proposed it is set aside as an
active hypothesis and scanning of the input continues. Other
cases of data manipulation instructions surrounded by cursor
positioning instructions will result in other hypotheses
being constructed. As these hypotheses are added to the
active hypothesis list they are checked for consistency and
if the new hypothesis causes conflicts they are resolved by
constructing another hypothesis from the conflicting
hypotheses. To demonstrate this mechanism we present an
example which will illustrate the generation of hypotheses
and resolution into a condition function. The example used
is the construction of the function which will recognize the
string 'time'.

Suppose the text file contained the following
sentences somewhere in the file.

The time is two oclock.

It is time to go to bed.

Time the runner.

Did you run out of time?

Also, suppose the user has specified the environment is to
be context sensitive and has begun to perform actions on the
file. The monitor could create the following instruction and
character sequence fragments from the user moving through
the text file and capitalizing these occurrences of 'time'.

(RRRRCRCRCRCRRRR ...)

(The tTiImMeE is ...)

(RRRRRRCRCRCRCRRRR ...)

(It is tTiImMeE to ...)

(RCRCRCRRRRR ...)

(TiImMeE the ...)

(... RRRRRRRRRRRCRCRCRCRR)

(... run out of tTiImMeE?)
This example is not to imply the user must change all
occurrences in the text file, but he should provide enough
examples from the file to ensure his desires are understood.
If the user has not supplied a distinguishing set of
examples and an incorrect program is generated he may add to
the set of examples.

Scanning the first instruction sequence until the
first data manipulation instruction results in the string
'time' being constructed. The resulting hypothesis is that
the string 'time' is within the context of 'The<sp>' and
'<sp>is two oclock.'. The hypothesis may be viewed as the
following data structure.

Hypothesis 1:

Begin context: The<sp>

String: time

End context: <sp>is two oclock.

A second hypothesis would be generated for the next portion
of the instruction sequence as shown below.

Hypothesis 2:

Begin context: It is<sp>

String: time

End context: <sp>to go to bed.

A comparison of these two hypotheses indicates a
disagreement between the contexts. The conflict is resolved
by determining the longest beginning and ending context that
agree between the two hypotheses and generating a hypothesis
reflective of this agreement. By working backward from the
last character in the begin context for both hypotheses, it
is possible to ascertain that the only character in
agreement is the space. Working forward from the first
character in the end context for both hypotheses, again the
only character in agreement is the space. A third hypothesis
with the new begin and end contexts is generated as follows:

Hypothesis 3:

Begin context: <sp>

String: time

End context: <sp>

This hypothesis specifies that the string 'time'
must be preceded and followed by a space. Note the test of
the hypothesis implies the user is allowed to specify one
string during an example computation. It is also implied
that there must be a begin and an end context for the
string. Since it is possible to have two hypotheses where
one of the context strings does not agree in any of the
characters, a method must exist to provide the appropriate
context.
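
The comparison of two hypotheses reduces to a longest common
suffix on the begin contexts and a longest common prefix on the
end contexts. A minimal sketch, with hypothetical function names
and with hypotheses held as dictionaries of plain strings:

```python
def common_suffix(a, b):
    """Longest string that ends both a and b (working backward)."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


def common_prefix(a, b):
    """Longest string that starts both a and b (working forward)."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def merge(h1, h2):
    """Combine two hypotheses for the same string into a more general one."""
    return {
        "begin": common_suffix(h1["begin"], h2["begin"]),
        "string": h1["string"],
        "end": common_prefix(h1["end"], h2["end"]),
    }


h1 = {"begin": "The ", "string": "time", "end": " is two oclock."}
h2 = {"begin": "It is ", "string": "time", "end": " to go to bed."}
h3 = merge(h1, h2)
print(h3)
```

For Hypotheses 1 and 2 only the single space survives on each
side, which reproduces Hypothesis 3.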

Whenever the comparison between contexts of two
hypotheses results in the null string, a disjunction is
formed from the characters immediately next to the string.
For example, the instruction sequence given above would give
the hypothesis:

Hypothesis  4:

Begin  context:  Did  you  run  out  of<sp>

String:  time

End  context:  ?

A comparison between hypothesis 3 and hypothesis 4
would result in the null string for the end context. Since
there must be an end context, the disjunction of <sp> and ?
is forced and this becomes the end context for the new
hypothesis. Generalization techniques that were mentioned in
the section on context free environment are then applied in
an attempt to reduce the end context to the most general
context consistent with the data seen. The only alteration
in the generalization scheme is the lowering of the
threshold values for important sets. In this example, the
threshold value for the punctuation set would be lowered to
1 and the ending context would become { x | x = space or
x ∈ {Punctuation} }.
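
The forced disjunction and the threshold-based generalization can be sketched as follows. This is a hedged illustration only: the generalization scheme itself is defined in the earlier section on the context free environment, so the reading of "threshold" used here (the number of members of a set that must be seen before the whole set replaces them) is an assumption, as are the function names and the "<Punctuation>" label.

```python
import string

PUNCTUATION = set(string.punctuation)


def adjacent_disjunction(end1: str, end2: str) -> set:
    """Disjunction of the characters immediately after the string.

    Both end contexts are assumed to be non-empty, since every
    hypothesis must carry an end context.
    """
    return {end1[0], end2[0]}


def generalize(chars: set, threshold: int = 1) -> set:
    """Replace punctuation members by a class label once enough are seen.

    ASSUMPTION: 'threshold' is the minimum count of punctuation
    characters required before the set as a whole is substituted.
    """
    punct = {c for c in chars if c in PUNCTUATION}
    if len(punct) >= threshold:
        return (chars - punct) | {"<Punctuation>"}
    return set(chars)


# End contexts of Hypothesis 3 (a space) and Hypothesis 4 (a question mark).
end_context = generalize(adjacent_disjunction(" ", "?"), threshold=1)
```

With the threshold lowered to 1, the single observed question mark is enough to generalize to the whole punctuation set; with a higher threshold the literal disjunction would be kept.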

The final problem to be solved is the recognition of
variations in a string. Examples of variations of a string
are 'Time' and 'time', or 'enclosure' and 'inclosure'. As
mentioned, if the user intends to capitalize all occurrences
of 'time', 'Time' is to be included. Note these variations
of the string become the compound labels for the arcs in
Figure 35. The system includes a rule that enables the
recognition of variations of strings provided the user gives
an example of the variation. The rule simply states that the
string length will be established to be as long as the
longest string encountered during processing. Again, using
the example, the hypothesis for 'Time the runner.' would be:

Hypothesis  5:

Begin  context:  ...  T

String:  ime

End  context:  <sp>the  runner.

It has been established by preceding user actions
that the string length for the hypothesis should be 4. By
matching the pattern in hypothesis 5 with the string from
hypothesis 4 it can be determined that the string in
hypothesis 5 should be expanded by inserting a 'T' in front
of the string. Another hypothesis is then generated where
the string will be the disjunction between the strings 'time'
and 'Time'. The final hypothesis from the example would then
be:

Hypothesis  6:

Begin  context:  <sp>

String:  'time'  v  'Time'

End  context:  { x | x = space or x ∈ Punc. }
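
The string-length rule that turns Hypothesis 5 into a variation of 'time' can be sketched as below. The function name expand_string and its tuple return value are assumptions made for illustration; the sketch only shows the idea of borrowing characters from the end of the begin context until the established length is reached.

```python
def expand_string(begin: str, string: str, target_len: int):
    """Grow 'string' to target_len by taking characters from the end of
    the begin context. Returns the shortened begin context and the
    expanded string. If the string is already long enough, nothing changes.
    """
    need = target_len - len(string)
    if need <= 0:
        return begin, string
    return begin[:-need], begin[-need:] + string


# Hypothesis 5: begin context ends in 'T', string is 'ime', and the
# established string length is 4.
new_begin, new_string = expand_string("... T", "ime", 4)

# The string of the next hypothesis is the disjunction of both variations.
variants = {"time", new_string}
```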

Once this hypothesis has been generated, it is then
used to examine the input for negative examples that can
strengthen or weaken the hypothesis. Suppose the input
contained the fragment "... timely results...". Processing
the input with Hypothesis 6 would show a match for the
string, but the end context would not agree; therefore, the
hypothesis will be strengthened by changing the end context
as shown below:

Final  Hypothesis:

Begin  context:  <sp>

String:  'time'  or  'Time'

End  context:  { x | x = space v x ∈ Punc. & x ∉ small letters }

After the input has been processed and a final
hypothesis proposed, the hypothesis is used to construct a
procedure such as shown in Figure 35. The first part of the
procedure to be constructed is the transitions for the
beginning context. The states in the procedure are
instructions in the instruction set, and the arc labels
consist of the information in the final hypothesis. A start
state is placed in the procedure with an arc to a move right
instruction (R). Since the procedure is a string match or
look-ahead routine, all states other than the start state
will be move right instructions. Each of the states will
have two arcs exiting it. The labels on these two arcs
will be the negation of each other.

The construction is accomplished by placing the
first character of the begin context on the exiting arc
going to a new move right state. The other arc is labeled
with the negation of the character and this arc terminates
at the first move right state. Each character of the begin
context creates another move right state labeled as
mentioned.

The string from the hypothesis is then used to
complete the procedure that has been partially constructed.
If the string is composed of disjunctions, the characters
are used to form disjunctions. The disjunctions are then
combined with conjunctions. The final hypothesis above
provides a string of 'time' or 'Time'. The conjunction of
disjunctions will be formed as:

('T'  v  't')  &  ('i'  v  'i')  &  ('m'  v  'm')  &  ('e'  v  'e') 

Upon  reduction  the  string  will  be  expressed  as: 

('T'  v  't')  &  'i'  &  'm'  &  'e'
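
The formation and reduction of the conjunction of disjunctions can be sketched in Python. This is an illustrative assumption about representation, not the thesis code: each string position is held as a set of characters, so duplicate alternatives such as ('i' v 'i') collapse automatically. The names position_labels and render are hypothetical.

```python
def position_labels(variants):
    """One character set per string position (a conjunction of disjunctions).

    All variants are assumed to have the same length, as established by
    the string-length rule.
    """
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("all variants must have the same length")
    return [frozenset(v[i] for v in variants) for i in range(length)]


def render(labels):
    """Write the reduced expression in the notation used in the text."""
    parts = []
    for label in labels:
        chars = sorted(label)
        if len(chars) == 1:
            parts.append("'%s'" % chars[0])
        else:
            parts.append("(" + " v ".join("'%s'" % c for c in chars) + ")")
    return " & ".join(parts)


labels = position_labels(["Time", "time"])
expression = render(labels)
```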

Each disjunction becomes a label on an arc to a new move
right state and the negation becomes the label on an arc
back to the original move right state.

Finally, the end context is added in the same manner
as the begin context. The first character becomes the label
on the last move right state created from the string and new
states are added for each character in the end context. The
result of these operations is displayed in Figure 35.
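
The behaviour of the constructed procedure can be approximated by the sketch below. It is a simplification under stated assumptions rather than a transcription of Figure 35: each move right state is modelled as one character set in a list, and the negated arc back to the first move right state is modelled by restarting the comparison at the next input position. The end context of Hypothesis 6 (space or punctuation) is used, and the names build_labels and find_matches are hypothetical.

```python
import string


def build_labels(begin, variants, end_chars):
    """Arc labels in order: begin context, string positions, end context."""
    labels = [frozenset(c) for c in begin]
    labels += [frozenset(v[i] for v in variants)
               for i in range(len(variants[0]))]
    labels.append(frozenset(end_chars))
    return labels


def find_matches(text, labels):
    """Return every position where all labels are satisfied in sequence.

    ASSUMPTION: a failed label sends the scan back to the first state,
    modelled here by trying the next start position.
    """
    hits = []
    for start in range(len(text) - len(labels) + 1):
        if all(text[start + k] in label for k, label in enumerate(labels)):
            hits.append(start)
    return hits


labels = build_labels(" ", ["time", "Time"], " " + string.punctuation)
text = "Is it time? The timely Time is now."
hits = find_matches(text, labels)  # 'timely' is rejected by the end context
```

Each reported position is the index of the begin-context character; the matched string starts one character later in this example.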

IV.  CONCLUSIONS  AND  RECOMMENDATIONS


A.  SYNTHESIZER

The synthesizer that has been implemented for this
thesis will produce programs from example computations in a
reasonable amount of time. The system response for most of
the traces was within 10 seconds on a Digital Equipment
Corporation PDP-11/fO minicomputer. The response time is a
function of the length of the trace and the number of
multiple occurrences of a particular instruction or set of
instructions in the final algorithm, with multiple
occurrences of an instruction affecting response time the
most. As Biermann [17] has noted, this has a nice
implication for programming by example because most
algorithms do not exhibit the characteristic of having a
large number of instances of the same instruction. In other
words, almost all multiple occurrences of an instruction in
an input trace are indicative of a loop in the algorithm.

In all of the test cases except those that required a
large number of backups, static processing accounted for at
least half of the total response time. Future modifications
to the synthesizer which would decrease the total response
time could be directed toward designing the static
processing stage more efficiently. However, the trade-off
between static processing and dynamic processing must be
kept in perspective. Static processing is a linear function
of the length of the trace, whereas dynamic processing,
since it is an enumerative search technique, is an
exponential function of the length of the trace.

Another area which should be considered is the dynamic
processing stage. There exists a plethora of research
questions within this area, the primary one being: Can more
information be gleaned from the input trace during static
processing which will decrease the search time for dynamic
processing? Difference sets and couple-classes provide some
powerful mechanisms for decreasing the amount of search;
however, lower bounds computations on the number of states
required by the machine often increase the amount of search.
Lower bounds are restrictive in nature. They are designed to
force the final algorithm into a minimum state configuration
which, in many cases, causes extra search time. Relaxation
of the lower bounds computation will result in a final
algorithm which may not be expressed in a minimum number of
states, but which will still be deterministic. There might
be better methods of initially computing the number of
states which would result in a closer estimate of the actual
number of states required for the algorithm. Obviously, the
closer the initial guess is to the actual requirement, the
less backup incurred, and, therefore, the less search time
required.


Since the amount of search required is governed by the
failure memory entries, the more dense the failure memory
can be made, the more directed the search becomes. So
another area for research is to determine if more
information exists in the failure memory entries than is
currently being used. How much information do the structure
factor and the free state factor provide? Is there another
factor which would be useful?

Finally, a more general question can be addressed. The
underlying structure of this technique is an enumerative
search. Can the technique be generalized to include other
algorithms which are enumerative in nature? What
modifications to the failure memory are needed? How would
difference sets and couple-classes be redefined?

B.  CONDITION  PROCESSING

The condition processor front-end to the synthesizer
relieves the user from worrying about some of the control
structure considerations by automatically generating
conditions. Another addition which would increase the power
of the synthesizer is an automatic loop variable generator
as discussed by Biermann [10]. Although the text editing
environment has been used in this thesis work, the part of
the condition processor design which deals with a context
free environment is general enough that it could be designed
to operate in any domain.

Condition generation in a context sensitive environment
is a much harder problem further complicated by requisite
pattern matching and pattern generation. Before this type of
condition generation can be generalized, much work has to be
done to increase the efficiency of pattern generation
schemes. Angluin [19] has shown a pattern generation scheme
which is a polynomial time algorithm for pattern generation
with one variable, but the domain we have examined will
require at least two variables. There is not a polynomial
time algorithm for pattern generation with two variables.
Heuristic techniques will probably be necessary to provide
methods of pattern generation which will be fast enough to
be useful over a wide range of problems.